The surface waters of the Northwest Atlantic Ocean are among the fastest warming on Earth. This area is highly-productive biologically, and there are concerns that ecological consequences will follow this rapid-warming. Research on the impacts of this rapid warming has primarily focused on high-profile and/or upper trophic level species. Ecological theory and laboratory studies suggest that elevated temperatures facilitate early maturation and smaller adult body-sizes. However, it is unclear whether that relationship might be mitigated against through adaptive behaviors in an open ocean environment. Here we’ve investigated ecosystem wide impacts on the individual size distribution (ISD) to track changes in community size structure. In cases where community responses are not adequate to counter the impacts of elevated temperatures, we anticipated a steepening of the size spectrum slope (ISD exponent). A steeper relationship relating to a reduction in larger sized individuals and an increased prevalence of smaller sized individuals. Using data from fisheries independent surveys we calculated the community size spectra for four regions along the US NE continental shelf. Correlation/regression analyses were then performed to then assess the degree to which these changes were in alignment to hypothesized bottom-up and top-down disturbances. At the regional scale, we found that community size structure changes (spectra slope) were the largest in the Northern regions, in the Gulf of Maine and Georges Bank. These areas are home to the coldest temperatures and the largest proportions of groundfish species in the community. Spectrum slope declines were most pronounced in the 80’s and 90’s, before the rapid warming of the last decade. The timing of these declines suggest that external factors drove the initial declines of larger-sized individuals within the communities, before elevated temperatures began to influence the ecosystem. Correlation analyses reveal that while fisheries landings are strongly correlated with these declines, bottom-up factors of zooplankton community metrics, Gulf Stream Index, and SST anomalies are also important. While the primary pressure of fisheries exploitation has declined dramatically over time, the recovery of larger-sized individuals has not been seen. That kind of recovery will likely depend on the elevated temperatures seen over the last decade.
Size is a defining characteristic of species and mediates many ecological interactions and metabolic pathways (Brown et al. (2000)) . Size is a big factor in determining the mobility of an organism. Mobility then impacts the ability to evade predation, foraging success, efforts to locate and follow essential habitats, geographic ranging, and the metabolic costs associated with each of these behaviors (Hillaert et al. (2018)). Body size also mediates vulnerability to aspects the immediate environment such as temperature through heat exchange or the threat of desiccation in terrestrial species (Gillooly et al. (2001); Heatwole et al. (1969)). Body size even informs life history features like life span and the trophic position an individual might occupy through its impact on metabolism and resource use (White et al. (2007)).
Size structured environments are a fundamental organizational pattern globally that emerges from these relationships add_citation. Within strongly size-structured ecosystems, growth and maturity changes alter fitness and ultimately determine whether a species is successful in that environment add_citation . Ecological theory is rich with models relating how energy transfers from smaller prey species to larger predatory trophic levels, the allocation of energy for growth, and the trade offs of allocating energy towards those ends (Bertalanffy (1938); Bertalanffy (1957); add_citations). A globally persistent pattern in ecology entangled in those relationships and their critiques is the decline in abundance with increasing body size (Damuth (1981); Currie (1993); Sheldon et al. (1972); Loeuille and Loreau (2006)).
This relationship, between size and abundance, integrates multiple processes operating on the cellular, individual, and community levels simultaneously. The quantities for size and abundance are also some of the most readily collected data assets of any ecological community. This creates an opportunity to learn much about a system from a relatively low-effort in data collection. For these reasons, size spectrum analyses and individual size distribution (ISD) methods have gained increasing attention as an entry point to assessing ecosystem health and to detect system-wide disturbance (Shin et al. (2005)‘; Pomeranz et al. (2022); White et al. (2007)). An advantage of these models is they avoid the need to explicitly articulate each predator-prey interactions as they and can be estimated from the commonly collected measures of abundance and size. The “size spectrum” describes the distribution of biomass or abundance as a function of individuals’ mass or size on a log–log scale (Guiet et al. (2016); Kerr and Dickie (2001)) . Size spectra are described by two terms, the size spectrum slope & intercept. These two terms reveal a sense of the baseline productivity, and how energy flows through an ecosystem (in the form of biomass) from many smaller individuals to many fewer large individuals. The spectrum intercept has been linked to the productivity of the community, and is often connected to the prevailing environmental conditions (Boudreau and Dickie (1992); Rossberg (2012)).
How are they used in practice:
Size spectra condense the complexities of predator prey networks and their interactions into a handful of size-based indices. These indices capture the emergent properties of a system, and have become increasingly used as indicators of ecosystem health. Within the context of fisheries management, changes in spectrum slopes have been associated with fishing exploitation, primarily through the targeted removal of larger individuals (Bianchi et al. (2000); Shin et al. (2005)). Numerical experiments have also linked changes in slope to environmental disturbances (Guiet et al. (2016)). Size spectra have also been shown to express predictable relationships between ecosystems of similar productivity levels as well as from distinct temperatures (Guiet et al. (2016)).
Use Pomerantz paper & Edwards to extend into ISD
Temperature & Ecology
Temperature plays a critical role on biological life impacting many of the chemical reactions that underpin basic physiological function. Temperature has direct and indirect impacts on critical biological functions including the acquisition of biomass through feeding, the rates of biomass loss through metabolism, and the rates at which individuals mature and develop. Because of these relationships, most species have evolved thermal preferences around which these chemical reactions are optimized. Species that are unable to maintain their thermal preferences internally must be able to follow their thermal preference in the environment through locomotion or adapt to less-favorable conditions through changes in behavior or risk metabolic costs in failing to do so. In an era of anthropogenic climate change, there is an expectation that many species will be displaced from historic habitats in their efforts to follow their thermal preferences. Recent research in marine environments has shown evidence of this as species are now shifting to higher latitudes and to deeper depths in the pursuit of more favorable conditions (Kleisner et al. (2017); Pinsky et al. (2013)). Others have suggested that temperature related impacts may not be seen through geographic distribution change, but through physiological changes, changes in seasonal phenology, or in dashed hopes of species recovery (Daan et al. (2005); Miller et al. (2018); Pershing et al. (2015); University of South Carolina et al. (2021)).
Need to connect temperature to size here
Direct quote from Guiet et al. (2016) , but nails the connection back to temp expectations:
Because it controls chemical reactions, temperature controls metabolic rates which underpin maintenance, growth or reproduction (Clarke and Johnston, 1999; Kooijman, 2010) as well as the functional responses to food density (Rall et al., 2012). Guiet et al. (2016)… In addition to the impact of temperature on communities’ intercepts (heights), the impact of temperature on the speed of the energy flow within communities may affect other properties, such as their resilience to perturbations or the intensity of trophic cascades (Andersen and Pedersen, 2009).
The potential for elevated temperatures to impact the size structure of an ecosystem has implications for the ecosystem resilience in the face of climate change, as well as the blue economies & natural resource systems that rely upon their good health.
Temperature of the Gulf of Maine & NE Shelf
In addition to the ecological disturbances of industrial fisheries, the Northwest Atlantic is also one of the fastest warming locations in the global oceans. Sea surface temperatures in the Gulf of Maine since 1982 have been warming at rates faster than 96% of the world’s oceans, with similar warming rates along the northwest Atlantic shelf (Pershing et al. (2018)). This persistent elevated temperature regime of the area is a result of several forces, a combination of shifting ocean currents and the unique bathymetry of the region. A Northward shift in the Gulf Stream directly increased the regional temperatures through increased transport of warm Gulf Stream water into areas like the Gulf of Maine. The Northward Gulf Stream shift is associated with a higher frequency of warm core rings, and the obstruction of cold-water Scotian Shelf current flow that would otherwise counter the influence of the Gulf Stream on the region’s temperatures (Gangopadhyay et al. (2019); University of South Carolina et al. (2021)). The combination of these oceanographic changes has led to a warmer continental shelf habitat.
The rapid warming in the northwest Atlantic is a major factor in the redistribution of marine species along the US east coast. Species have responded by adjusting the timing and locations of their seasonal migrations and shifting their geographic ranges (Nye et al. (2009); Staudinger et al. (2019)). There is evidence that warming has hampered fisheries recoveries as well. Adding a metabolic tax to physiological pathways like growth and metabolism. Species like Black Sea Bass, Atlantic shortfin squid, and Blue crab have been high-profile examples of species expanding their ranges to follow their thermal preferences. While species like the American lobster have shown declines at their southern range near Long Island Sound, with much doubt whether they will recover under the present temperature trends. The recent regime shift in the physical oceanography has also shown to be a catalyst for biological shifts as well (University of South Carolina et al. (2021); Perretti et al. (2017)).
While these examples show that species can respond to changes in the physical environment around them through movement & behavior, research elsewhere suggests that physiological responses integrated across species will manifest as changes in community size structure.
Fishing Impacts on Size/Species Composition
NEED fishing impacts on size structure references
Species Trends in the Northeast Atlantic Shelf
The continental shelf groundfish community in Northwest Atlantic has changed dramatically over the last century. Stocks that supported international fishing effort collapsed, and recovery efforts fell short of their objectives. Research on Georges Bank estimated that biomass more than halved in the 1960’s (pre-dating federal monitoring efforts), and noted a species replacement of commercial groundfish target species by skate and dogfish (Fogarty and Murawski (1998)). Industrial fishing is inherently size-selective, with larger individuals selectively removed from the population. This has an immediate impact on the community size-distribution with additional impacts on the future population as well. Larger individuals have a greater impact on population recovery, capable of holding more (and often of higher quality) eggs. Size-based harvest in fisheries has been shown to create selective pressures that promote characteristics of early maturation at smaller sizes add_citation.
reference gb spectra early work
Purpose
With the understanding that populations depend on the health of their ecosystems, there is a need to have community-wide metrics to effectively understand and manage marine resources. Size based indices are metrics that can be estimated from the information that has historically been available from long-term survey efforts. These indices have been shown to be sensitive to the impacts of fishing, but should also capture environmentally driven stress as well. We estimated size spectrum relationships as SBI’s for the groundfish populations for each sub-region of the Northeast US continental shelf. In the case of the NW Atlantic sustained increases in temperature should have a physiological impact on the community size structure.
This leads to our second hypothesis:
H2. Warming alters the community through the direct influence of temperature on metabolism, growth, and population productivity.
Methods
Fish Data Source and Processing
Data on the biomass, abundance, and size of fish on the Northeast U.S. Shelf were collected as part of the Northeast Fisheries Science Center’s bottom trawl survey ( (grosslein1969?), (Azarovits1981?), (politis2014?) ) This survey is conducted from Cape Hatteras, North Carolina to the Gulf of Maine each year in the spring and in the fall. The survey follows a stratified random sampling design, with strata defined based on depth, bottom habitat, and latitude. Trawls are performed for a fixed duration at each station, reporting aggregate abundance and biomass for all species caught, and measuring individual lengths and weights for the catch of each species or a sub-sample if that catch is large. Correction factors were applied to aggregate species abundance and biomass to account for changes in vessels, gear, and doors used in the survey over time ( (sissenwine1978?), (byrne1991?), Miller et al. (2010) ) However, abundance and biomass at length needed to be estimated after these aggregate corrections. As such, abundance at length for each species was adjusted to match the corrected aggregate species abundance at each station, such that for each species, the sum of the resulting estimated abundance numbers across each length is equal to the corrected aggregate abundance.
Community Composition & Functional Groups
Analyses were performed using 68 species. These species were selected based on the availability of published weight-at-length relationships (Wigley et al. 2003) and represented 98.98% of the total biomass caught in the survey. Each species was assigned to a functional group based on life history and geography using the definitions of (Hare et al. (2016)). Functional groups included coastal fish, diadromous fish, elasmobranch, groundfish, and pelagic fish (Table 1). Six species with available length-weight details did not have a functional group designation, these species were designated as reef species. Exploratory analyses showed that the pelagic species biomass was low in all regions, and is unlikely to be representative of true biomass trends due to gear selectivity.
Published length-weight relationships ( Wigley et al. 2003 ) were used to convert from length data, available for all individuals, into their corresponding biomass-at-length. To account for differences in sampling effort among survey strata, all corrected abundance-at-length data were area-stratified. Area-stratified biomass-at-length values were then computed as the product of area-stratified abundance-at-length and estimated weight-at-length. All analyses were performed using area-stratified abundances and their associated area-stratified biomass estimates, hereby referred to as simply abundance-at-length & biomass-at-length
Community Composition Metrics
Our analyses used all data collected during the spring and fall surveys from 1970-2019. Data were grouped using survey-design strata into four sub-regions: Gulf of Maine, Georges Bank, Southern New England, Mid-Atlantic Bight. These sub-regions have been widely used in regional ecological studies (e.g., ). For each region, we developed several time series indicators:
Community Composition Metrics (regional abundance and biomass by functional group)
Average length and weight of the aggregate community and each functional group
Exponent of the Individual Size Distribution Spectra
Body Size Changes
The average body length (cm) and body weight (kg) was calculated for all catch for each region and within each functional group. Each body-size metric was weighted by the abundance-at-length of a species sampled at each station such that.
\[X_j = \frac{ \sum_{}^{} n_iX_{ij} }{ \sum~w_i }\] Where \(X\) is the body-size metric, \(j\) is the year, & \(n\) is the abundance for each station \(i\).
Data for body size trends were not truncated using any minimum or maximum size and reflect all available catch data for the 68, for which biomass-at-length could be estimated.
Estimating Size Spectra
Lengths of individuals in the catch data are measured to the nearest cm, with smaller specimens measured to the nearest millimeter. Because individual biomass is estimated from those length measurements, there is a range of possible body mass values between the cm & mm increments. The relationship between length and mass in fishes is exponential and taxon specific, so biases resulting from using only the lower or upper end of those ranges is different for each taxon and increases for larger taxa. To account for this and reduce biases, we used the extended likelihood method (MLEbin) of Edwards et al. (2020). This method estimates the exponent of size spectra (b) for a bounded power law relationship between abundance and our length-derived estimates of individual biomass.
The exponent of this relationship is analogous to slope estimates from linear regression on logarithmic axes, with a steeper slope (i.e. a more negative b) indicating fewer large-bodied individuals and/or more small-bodied fishes Carvalho et al. (2021). This method has been shown to be the most accurate for estimating the exponent of size spectra when tested against alternative methods that require arbitrary decisions around binning the data (White et al. (2008); Sprules and Barth (2016); Edwards et al. (2017) ).
Using this method, the size spectra exponent (b) was estimated for each survey region from 1970 to 2019. A minimum biomass of 1g was used for the lower limit and a maximum biomass of 10kg was used as an upper limit for the ISD’s bounded power law probability density function Equation 1, where \(x\) is body mass & \(\lambda\) is the scaling exponent of the ISD. The biomass within these limits constitutes 97.83% of all estimated biomass for the 68 used in this study. This truncation accounts for poor gear selectivity at the smallest and largest size ranges and imposes shared limits to the size distribution we’d expect to sample across these different areas. Exponents of size spectra were calculated using code modified from the sizeSpectra package in R (Edwards et al. (2017); Edwards et al. (2020)).
Drivers of Size Spectrum Change
The impact of external factors on the changes in size spectra was correlated against several hypothesized driving forces related to both environmental regimes and anthropogenic disturbances. Potential large-scale environmental drivers include regional SST anomalies & the larger-scale impact of the Gulf Stream Index (GSI). Fishing pressure represents the primary top-down anthropogenic driver in the region, and was investigated using the aggregate regional commercial landings data. The potential bottom-up impacts of zooplankton community differences were incorporated at a regional scale using small and large zooplankton indices from the EcoMon plankton survey.
Gulf Stream Index
Gulf stream index (GSI) values were obtained from the ecodata package in R (Bastille & Hardison 2018) . This package supplies GSI data at monthly intervals following the methodology of Pérez-Hernández and Joyce (2014) and Joyce et al. (2019) , using as sea level height anomaly data from the Copernicus Marine Environment Monitoring Service
Sea Surface Temperature Data
Global Sea surface temperature data was obtained via NOAA’s optimally interpolated SST analysis (OISSTv2), providing daily temperature values at a 0.25° latitude x 0.25° longitude resolution (Reynolds et al. 2007). A daily climatology for every 0.25° pixel in the global data set was created using average daily temperatures spanning the period of 1982-2011. Daily anomalies were then computed as the difference between observed temperatures and the daily climatological average. OISSTv2 data used in these analyses were provided by the NOAA PSL, Boulder, Colorado, USA from their website at https://psl.noaa.gov/data/gridded/data.noaa.oisst.v2.html.
Sea surface temperature data was regionally averaged to match the survey regions from the age-at-length data. SST anomalies were averaged by year for each region and over the entire sampling region to produce daily time series. These time series were then processed into annual timeseries of surface temperatures and anomalies. All region-averaging was done with area-weighting of the latitude/longitude grid cells to account for differences in cell-size in the OISSTv2 data.
Commercial Fishing
Fishing pressure in the region was indexed using state and federal commercial fishing landings. These data were obtained from the Greater Atlantic Regional Fisheries Office (GARFO) for statistical areas that are routinely used for fisheries reporting and management (Figure X). Individual statistical areas were aggregated into regions that closely align with the survey areas we defined for the size spectra analyses (Figure X).
Zooplankton Abundance Indices
Zooplankton data are from the National Oceanographic and Atmospheric Administration Marine Resources Monitoring, Assessment and Prediction (MARMAP) program and Ecosystem Monitoring (EcoMon) cruises detailed extensively in Kane (2007), Kane (2011), and Morse et al. (2017). Abundance anomalies are computed from the expected abundance on the day of sample collection. The small copepod index is computed by averaging the individual abundance anomalies of Pseudocalanus spp., Centropages hamatus, Centropages typicus, and Temora longicornis. The large-copepod anomalies is the abundance anomaly of Calanus finmarchicus, the largest copepod in the Northeast U.S. region. Anomalies for both large and small zooplankton groups are then averaged within ecological production units for annual timeseries.
Analysis of Driver Impacts
Driver Cross Correlation Functions
In addition to evaluating the direct and contemporaneous impacts of each potential driver, additional predictors were identified for inclusion in the model-selection process through the use of cross correlation function (CCF) estimates and changepoint analysis. CCF estimates were used to look at correlation at one-year lag intervals to investigate potential time-delayed effects between the hypothesized drivers and the community’s size spectra. CCF estimates were performed between the dependent variables of the size spectra slope with the independent variables of the annual gulf stream index, the corresponding regional commercial landings and sea surface temperature anomalies, and both small and large zooplankton indices. Annual time lags on independent variables of up to 10 years that were identified in the CCF analysis were then included as additional candidate predictors in the regression analysis on drivers of the spectrum slope changes.
Slope Changepoint Analyses
Changepoint analysis was used as a means to explore the temporal structure in changes of the regional size spectrum slopes. Changepoint analysis was performed using the envcpt package ((killick?) et al), testing comparing support for 12 candidate model structures and any identified changepoints. The most supported changepoint structures were then used to inform whether (if any) regional spectra expressed distinct periods of behavior. If changepoints were identified, this was used as evidence that there may have been a community regime change. Following this thinking it is plausible for there to be inconsistencies or non-stationarity in the predictors’ impact on the community spectra, as different community regimes may be differently vulnerable to top-down pressures like commercial fishing. To allow for changes in these relationships across different regimes, separated by changepoints, an additional interaction term was added to the regression model candidates for that region.
Driver Correlation Periods
Regression Analyses
Following exploratory analyses with CCF functions and changepoint analyses, regression models were developed and tested independently for each region. Potential drivers included in the models were the annual Gulf Stream Index, the annual regionally averaged SST anomaly value, regionally averaged small and large copepod indices, and the scaled annual commercial landings from that region. Gulf stream index, zooplankton indices, and SST anomalies were not standardized prior to use in model fitting and evaluation as the two former are already an index value and the latter is scaled to a thirty-year climatological reference period. In the event that CCF analysis identified an important time lag for a region’s independent variables, these specific time-lagged predictors were included as predictors in the candidate models, otherwise time-lags on predictors were not evaluated. For any region(s) that showed support for changepoints in their temporal structure, the periods separated by those breaks were included as an interaction term on all other predictors. In the absence of any identified changepoints, no multi-year periods were included as potential regime impacts. All candidate models for a region were evaluated using AIC to determine the most parsimonious models for each regional system.
Results
Community Abundance
Stratified abundance estimates have been gradually increasing on the Northeast Shelf since 1970. Abundance growth accelerated after 2007 to peak levels in 2014. Abundance then began declining, which continued through 2019 (Figure 2). In the Gulf of Maine, fish abundance remained relatively-low until the 1990’s, at which point it began to steadily rise–closely aligning with the pattern for the Northeast Shelf. This increase in abundance reversed briefly during 2002-2006, but continued to rise after 2006 before hitting a regional peak in 2016. Populations then declined through 2019 similar to the shelf-wide trend. Georges Bank abundances were consistently low from 1970 around 2010. By 2014 abundance had roughly quadrupled, propelled by strong recruitment classes of haddock. After 2014, abundance quickly fell to numbers more similar to the 1970-2010 levels by the end of the decade. Community abundance in Southern New England displayed higher inter-annual changes across all years compared to both Gulf of Maine and Georges Bank. Abundance in this area showed a less dramatic rise and fall than the Northern regions. Abundances began increasing rapidly here in 2007, before falling back to earlier levels by the end of the 2010’s decade. The Mid Atlantic Bight displayed the most inter-annual variability and had no major trends in fish abundance.
Groundfish species were the dominant functional group driving the abundance and biomass trends in the Gulf of Maine and Georges Bank, with the two southern regions showing a more balanced abundance distribution among the five functional groups (Figure 2).
Community Biomass
Similar to abundance, the overall biomass was highest in the two northern regions, the Gulf of Maine and Georges Bank. Roughly half of the biomass sampled in these regions can be attributed to groundfish species, with the second largest contributions coming from elasmobranchs. Within the groundfish biomass, larger individuals >2kg in particular, declined during the 70’s and 80’s in these regions, never truly recovering. Beginning in the 2000’s there were signs that groundfish abundances were increasing as evidenced by increasing numbers of smaller individuals, however in both regions this trend appears to have reversed by the mid 2010’s. Elasmobranch biomass increased steadily throughout the survey time period across all regions, with the exception of southern New England. This area showed large 5-10 year swings in biomass, but no clear long-term trend. Larger elasmobranch were rare in all regions except for a period spanning the late 70’s through the early 90’s isolated to Georges Bank. Demersal species biomass was highest in the Gulf of Maine, dwarfing their contributions in other regions. Their biomass declined in the 70’s, was flat until the late 90’s, remaining relatively high until declining in the late 2010’s. Pelagic species biomass was low in all regions, and is unlikely to be representative of true biomass trends due to gear selectivity.
Regional Variation in Species Composition
There was a distinct difference between Northern and Southern regions in the way biomass was distributed among the different functional groups. The largest contributors to biomass in the southern regions (southern New England & mid-Atlantic bight) was the elasmobranch community. While the northern regions (Gulf of Maine & Georges Bank) each had similar quantities of elasmobranch biomasses, there was also a comparable contribution of groundfish and in the Gulf of Maine there was a major component of demersal species as well.
Body Size Trends
The average fish size in the Gulf of Maine (length and weight) declined the greatest of all regions over our study period. The average individual length was greatest in the 1970’s in the 35-40cm range, falling to 28-33cm over the last decade. Body-weight fell dramatically in the 1980’s, from around .75kg in the 1970’s to .25-.30kg, roughly a third of what it had been. Georges Bank body sizes also declined during the study period, but less dramatically. Both of these Northern regions had brief period in the late 2000’s where average length and weight rose, before falling again in the 2010’s. The MAB region was the only region to see a long-term increase in both length and weight during the study period. SNE saw no long-term change in length, and a minor decline in average body-weight.
At the start of our time series, back in the 1970’s, there was a clear difference in the relative positions of spectra parameters among the different regions. Gulf of Maine and Georges Bank showed the least steep spectra slopes in the earlier time periods with slopes around -1 & -1.1 respectively. The relatively flat slopes in these regions both steepened over time, settling near -1.3 (GoM) and -1.5 (GB). Gulf of Maine experienced much of its decline during the 1980’s and 1990’s. There was a brief reversal in this trend during the 2000’s, but slopes continued to steepen by 2010 and remained steep through 2019. Georges Bank did not experience as rapid of a decline, but experienced a similar long-term steepening. In contrast to the northern regions, SNE and MAB had steeper slopes in the -1.2 to -1.5 territory. The long term pattern for SNE was one of increasing volatility, but not so much a decline. The spectra slope for the MAB was less volatile, but similarly maintained a relatively stable wander around -1.4. By the end of the study period all regions had slopes that were at or near a similar level.
Spectra Slope changepoints and Regression Analysis
Following break point exploratory analyses with the EnvCpt package, the Gulf of Maine was the only region from which changepoints were identified in the best supported model from the candidate changepoint model structures. Changes in Gulf of Maine’s spectrum slope was best modeled by a model allowing breaks in the trend, and with an auto-regressive process on the previous 2 years. From this model, breaks were identified at 1999 and 2007. Georges Bank was best modeled with a single trend, with no changepoints. Southern New-England showed the most support for a single intercept model, with no changepoints, with values centered on mu = -1.04 & sd = 0.046. The Mid-Atlantic Bight was best supported by a single trend model, with no changepoints, and an increasing intercept through the study period.
Region
Most Supported Model
gom
trendar2cpt
gb
trend
sne
meancpt
mab
trend
Driver Differences Across Regimes
changepoint analyses suggest there were as many as 3 distinct regimes in the Gulf of Maine slope changes. For all other regions there was not strong support for changepoints.
Exploratory analyses for Georges Bank indicated that the structure of the data did not show support for clear changepoints. CCF results of individual predictors showed that spectrum slope was correlated with small zooplankton index values from two years prior. This relationship was added to the standard predictor candidates, each operating at a simultaneous time step.
The following panels show the historical changes in each of the drivers. Landings have been scaled by average total landings within each region across all years. SST and GSI have not been scaled. This is different from how they are implemented in the regression analyses, when the landings were scaled over the 1982-2019 period.
Table 1: Common and scientific names for the species that constitute each functional group used in our analyses. X markers are used to indicate which regions each species has been caught in the data.
Functional Group Assignments and Regional Presence/Absence
Common Name
Scientific Name
Gulf of Maine
Georges Bank
Southern New England
Mid-Atlantic Bight
Coastal - (18)
Atlantic Croaker
micropogonias undulatus
X
X
X
Atlantic Spadefish
chaetodipterus faber
X
Atlantic Thread Herring
opisthonema oglinum
X
X
Black Sea Bass
centropristis striata
X
X
X
X
Blackbelly Rosefish
helicolenus dactylopterus
X
X
X
X
Blueback Herring
alosa aestivalis
X
X
X
X
Bluefish
pomatomus saltatrix
X
X
X
X
Butterfish
peprilus triacanthus
X
X
X
X
Cunner
tautogolabrus adspersus
X
X
X
X
Greater Amberjack
seriola dumerili
X
X
Northern Kingfish
menticirrhus saxatilis
X
X
X
Scup
stenotomus caprinus
X
X
X
X
Southern Kingfish
menticirrhus americanus
X
Spanish Mackerel
scomberomorus maculatus
X
Spanish Sardine
sardinella aurita
X
Spot
leiostomus xanthurus
X
X
Striped Bass
morone saxatilis
X
X
X
X
Weakfish
cynoscion regalis
X
X
X
Diadromous - (2)
American Shad
alosa sapidissima
X
X
X
X
Atlantic Sturgeon
acipenser oxyrhynchus
X
Elasmobranch - (19)
Atlantic Angel Shark
squatina dumeril
X
Atlantic Sharpnose Shark
rhizoprionodon terraenovae
X
Barndoor Skate
dipturus laevis
X
X
X
X
Bullnose Ray
myliobatis freminvillei
X
X
Chain Dogfish
scyliorhinus retifer
X
X
X
Clearnose Skate
raja eglanteria
X
X
Cownose Ray
rhinoptera bonasus
X
Little Skate
leucoraja erinacea
X
X
X
X
Rosette Skate
leucoraja garmani
X
X
X
X
Roughtail Stingray
dasyatis centroura
X
Sand Tiger
carcharias taurus
X
Sandbar Shark
carcharhinus plumbeus
X
X
Smooth Butterfly Ray
gymnura micrura
X
Smooth Dogfish
mustelus canis
X
X
X
X
Smooth Skate
malacoraja senta
X
X
X
X
Spiny Butterfly Ray
gymnura altavela
X
Spiny Dogfish
squalus acanthias
X
X
X
X
Thorny Skate
amblyraja radiata
X
X
X
X
Winter Skate
leucoraja ocellata
X
X
X
X
Groundfish - (25)
Acadian Redfish
sebastes fasciatus
X
X
X
X
American Plaice
hippoglossoides platessoides
X
X
X
X
Atlantic Cod
gadus morhua
X
X
X
X
Atlantic Halibut
hippoglossus hippoglossus
X
X
X
Atlantic Wolffish
anarhichas lupus
X
X
X
Cusk
brosme brosme
X
X
X
X
Fawn Cusk-Eel
lepophidium profundorum
X
X
X
X
Fourspot Flounder
paralichthys oblongus
X
X
X
X
Goosefish
lophius americanus
X
X
X
X
Haddock
melanogrammus aeglefinus
X
X
X
X
Longhorn Sculpin
myoxocephalus octodecemspinosus
X
X
X
X
Northern Searobin
prionotus carolinus
X
X
X
X
Ocean Pout
macrozoarces americanus
X
X
X
X
Offshore Hake
merluccius albidus
X
X
X
X
Pollock
pollachius virens
X
X
X
X
Red Hake
urophycis chuss
X
X
X
X
Sea Raven
hemitripterus americanus
X
X
X
X
Silver Hake
merluccius bilinearis
X
X
X
X
Spotted Hake
urophycis regia
X
X
X
X
Summer Flounder
paralichthys dentatus
X
X
X
X
White Hake
urophycis tenuis
X
X
X
X
Windowpane Flounder
scophthalmus aquosus
X
X
X
X
Winter Flounder
pseudopleuronectes americanus
X
X
X
X
Witch Flounder
glyptocephalus cynoglossus
X
X
X
X
Yellowtail Flounder
limanda ferruginea
X
X
X
X
Pelagic - (4)
Atlantic Herring
clupea harengus
X
X
X
X
Atlantic Mackerel
scomber scombrus
X
X
X
X
Buckler Dory
zenopsis conchifera
X
X
X
X
Round Herring
etrumeus teres
X
X
X
X
Functional group assignments adapted from Hare et al. 2010
Top Commercial Fisheries Landings of Northeastern US (by weight)
Avg. Annual Landings (lb.)
Total Landings (lb.)
Total Value ($)
Gulf of Maine - 1960
Hake, Silver
16.58M
281.87M
8.71M
Herring, Atlantic
11.57M
138.83M
2.50M
Redfish, Acadian
2.12M
88.97M
3.41M
Gulf of Maine - 1970
Herring, Atlantic
22.78M
501.08M
19.70M
Menhaden, Atlantic
17.78M
373.48M
7.87M
Redfish, Acadian
3.14M
219.85M
23.87M
Gulf of Maine - 1980
Herring, Atlantic
21.78M
653.26M
34.52M
Menhaden, Atlantic
21.24M
509.75M
12.77M
Pollock
3.33M
229.57M
62.00M
Gulf of Maine - 1990
Herring, Atlantic
25.21M
958.12M
54.12M
Cod, Atlantic
2.35M
138.76M
131.76M
Shark, Dogfish, Spiny
3.34M
120.17M
15.95M
Gulf of Maine - 2000
Herring, Atlantic
2.99M
47.77M
4.31M
Monkfish/Angler/Goosefish
716.21K
31.51M
51.13M
Cod, Atlantic
692.95K
30.49M
42.30M
Gulf of Maine - 2010
Tuna, Bluefin
209.06K
3.76M
33.30M
Shark, Dogfish, Spiny
479.11K
2.87M
590.62K
Pollock
188.20K
1.69M
2.08M
Georges Bank - 1960
Haddock
15.00M
270.06M
34.41M
Hake, Silver
6.83M
95.57M
3.19M
Cod, Atlantic
4.88M
87.89M
8.12M
Georges Bank - 1970
Cod, Atlantic
7.78M
233.48M
59.16M
Flounder, Yellowtail
4.62M
138.52M
43.16M
Redfish, Acadian
2.63M
76.37M
9.09M
Georges Bank - 1980
Cod, Atlantic
10.11M
404.40M
211.60M
Flounder, Winter
2.50M
100.11M
89.84M
Haddock
2.36M
94.27M
66.68M
Georges Bank - 1990
Cod, Atlantic
4.27M
192.29M
190.26M
Hake, Silver
1.79M
76.82M
20.49M
Flounder, Winter
1.23M
56.43M
75.59M
Georges Bank - 2000
Cod, Atlantic
2.17M
62.91M
75.20M
Herring, Atlantic
3.49M
48.92M
3.73M
Haddock
1.54M
43.01M
55.37M
Georges Bank - 2010
Hake, Silver
155.88K
779.40K
499.90K
Haddock
39.65K
118.95K
143.04K
Flounder, Winter
40.40K
80.80K
216.28K
Southern New England - 1960
Other Fish, Bony
14.84M
400.77M
3.73M
Flounder, Yellowtail
6.56M
196.83M
19.12M
Flounder, Winter
2.52M
70.58M
7.01M
Southern New England - 1970
Menhaden, Atlantic
9.99M
239.84M
5.12M
Other Fish, Bony
4.05M
206.59M
2.49M
Flounder, Yellowtail
2.07M
153.55M
36.47M
Southern New England - 1980
Menhaden, Atlantic
6.60M
217.68M
10.21M
Hake, Silver
2.56M
205.02M
46.11M
Flounder, Yellowtail
1.66M
132.92M
83.38M
Southern New England - 1990
Hake, Silver
2.52M
196.81M
78.54M
Herring, Atlantic
2.12M
129.02M
7.19M
Menhaden, Atlantic
3.71M
125.98M
8.69M
Southern New England - 2000
Mackerel, Atlantic
2.55M
135.06M
15.60M
Hake, Silver
1.00M
55.25M
26.89M
Skate, Nk
950.56K
49.43M
6.53M
Southern New England - 2010
Scup
161.10K
6.44M
4.29M
Hake, Silver
145.07K
4.21M
3.12M
Flounder, Summer
80.43K
3.86M
11.54M
Mid-Atlantic Bight - 1960
Flounder, Summer
2.03K
4.05K
720.00
Flounder, Yellowtail
2.33K
2.33K
214.00
Flounder, Witch
395.00
395.00
36.00
Mid-Atlantic Bight - 1970
Menhaden, Atlantic
10.20M
50.98M
1.59M
Weakfish/Sea Trout, Squeteague
886.91K
9.76M
1.40M
Scup
876.60K
8.77M
2.09M
Mid-Atlantic Bight - 1980
Menhaden, Atlantic
30.78M
646.41M
10.94M
Flounder, Summer
1.15M
83.83M
72.00M
Scup
550.89K
37.46M
15.53M
Mid-Atlantic Bight - 1990
Menhaden, Atlantic
115.86M
4.63B
286.14M
Mackerel, Atlantic
1.67M
103.62M
13.87M
Croaker, Atlantic
1.35M
71.65M
22.53M
Mid-Atlantic Bight - 2000
Menhaden, Atlantic
69.60M
2.64B
167.17M
Croaker, Atlantic
2.16M
106.02M
42.93M
Mackerel, Atlantic
1.70M
59.41M
6.38M
Mid-Atlantic Bight - 2010
Menhaden, Atlantic
118.29M
1.89B
154.46M
Bass, Striped
1.70M
25.56M
75.05M
Croaker, Atlantic
1.08M
24.81M
21.37M
Landings data obtained from the Greater Atlantic Regional Fishing Office (GARFO)
Bianchi, G., Gislason, H., Graham, K., Hill, L., Jin, X., Koranteng, K., Manickchand-Heileman, S., et al. 2000. Impact of fishing on size composition and diversity of demersal fish communities. ICES Journal of Marine Science, 57: 558–571. https://doi.org/10.1006/jmsc.2000.0727.
Boudreau, P. R., and Dickie, L. M. 1992. Biomass spectra of aquatic ecosystems in relation to fisheries yield. Canadian Journal of Fisheries and Aquatic Sciences, 49: 1528–1538. https://cdnsciencepub.com/doi/abs/10.1139/f92-169.
Brown, J., West, G. B., and Enquist, B. 2000. Scaling in biology: Patterns and processes, causes and consequences. In pp. 1–24.
Carvalho, P. G., Setiawan, F., Fahlevy, K., Subhan, B., Madduppa, H., Zhu, G., and Humphries, A. T. 2021. Fishing and habitat condition differentially affect size spectra slopes of coral reef fishes. Ecological Applications, 31: e02345. https://onlinelibrary.wiley.com/doi/abs/10.1002/eap.2345.
Daan, N., Gislason, H., G. Pope, J., and C. Rice, J. 2005. Changes in the North Sea fish community: evidence of indirect effects of fishing? ICES Journal of Marine Science, 62: 177–188. https://academic.oup.com/icesjms/article/62/2/177/602590.
Edwards, A., Robinson, J., Blanchard, J., Baum, J., and Plank, M. 2020. Accounting for the bin structure of data removes bias when fitting size spectra. Marine Ecology Progress Series, 636: 19–33. https://www.int-res.com/abstracts/meps/v636/p19-33/.
Gangopadhyay, A., Gawarkiewicz, G., Silva, E. N. S., Monim, M., and Clark, J. 2019. An Observed Regime Shift in the Formation of Warm Core Rings from the Gulf Stream. Scientific Reports, 9: 12319. http://www.nature.com/articles/s41598-019-48661-9.
Gillooly, J. F., Brown, J. H., West, G. B., Savage, V. M., and Charnov, E. L. 2001. Effects of Size and Temperature on Metabolic Rate. Science, 293: 2248–2251. https://www.science.org/doi/10.1126/science.1061967.
Hare, J. A., Morrison, W. E., Nelson, M. W., Stachura, M. M., Teeters, E. J., Griffis, R. B., Alexander, M. A., et al. 2016. A Vulnerability Assessment of Fish and Invertebrates to Climate Change on the Northeast U.S. Continental Shelf. PLOS ONE, 11: e0146756. https://dx.plos.org/10.1371/journal.pone.0146756.
Heatwole, H., Torres, F., Blasini De Austin, S., and Heatwole, A. 1969. Studies on anuran water balanceI. Dynamics of evaporative water loss by the coquí, eleutherodactylus portoricensis. Comparative Biochemistry and Physiology, 28: 245–269. https://www.sciencedirect.com/science/article/pii/0010406X69913425.
Hillaert, J., Hovestadt, T., Vandegehuchte, M. L., and Bonte, D. 2018. Size-dependent movement explains why bigger is better in fragmented landscapes. Ecology and Evolution, 8: 10754–10767. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6262741/.
Joyce, T. M., Kwon, Y.-O., Seo, H., and Ummenhofer, C. C. 2019. Meridional Gulf Stream Shifts Can Influence Wintertime Variability in the North Atlantic Storm Track and Greenland Blocking. Geophysical Research Letters, 46: 1702–1708. https://onlinelibrary.wiley.com/doi/abs/10.1029/2018GL081087.
Kerr, S. R., and Dickie, L. M. 2001. The Biomass Spectrum: A Predator-Prey Theory of Aquatic Production. Columbia University Press.
Kleisner, K. M., Fogarty, M. J., McGee, S., Hare, J. A., Moret, S., Perretti, C. T., and Saba, V. S. 2017. Marine species distribution shifts on the U.S. Northeast Continental Shelf under continued ocean warming. Progress in Oceanography, 153: 24–36. https://linkinghub.elsevier.com/retrieve/pii/S0079661116301896.
Miller, T. J., O’Brien, L., and Fratantoni, P. S. 2018. Temporal and environmental variation in growth and maturity and effects on management reference points of Georges Bank Atlantic cod. Canadian Journal of Fisheries and Aquatic Sciences, 75: 2159–2171. http://www.nrcresearchpress.com/doi/10.1139/cjfas-2017-0124.
Nye, J., Link, J., Hare, J., and Overholtz, W. 2009. Changing spatial distribution of fish stocks in relation to climate and population size on the Northeast United States continental shelf. Marine Ecology Progress Series, 393: 111–129. http://www.int-res.com/abstracts/meps/v393/p111-129/.
Perretti, C., Fogarty, M., Friedland, K., Hare, J., Lucey, S., McBride, R., Miller, T., et al. 2017. Regime shifts in fish recruitment on the Northeast US Continental Shelf. Marine Ecology Progress Series, 574: 1–11. http://www.int-res.com/abstracts/meps/v574/p1-11/.
Pershing, A. J., Alexander, M. A., Hernandez, C. M., Kerr, L. A., Le Bris, A., Mills, K. E., Nye, J. A., et al. 2015. Slow adaptation in the face of rapid warming leads to collapse of the gulf of maine cod fishery. Science, 350: 809–812. https://www.science.org/doi/full/10.1126/science.aac9819.
Pomeranz, J. P. F., Junker, J. R., and Wesner, J. S. 2022. Individual size distributions across North American streams vary with local temperature. Global Change Biology, 28: 848–858. https://onlinelibrary.wiley.com/doi/10.1111/gcb.15862.
Rossberg, A. G. 2012. 6 - A Complete Analytic Theory for Structure and Dynamics of Populations and Communities Spanning Wide Ranges in Body Size. In pp. 427–521. Ed. by U. Jacob and G. Woodward. Academic Press. https://www.sciencedirect.com/science/article/pii/B9780123969927000083.
Shin, Y.-J., Rochet, M.-J., Jennings, S., Field, J. G., and Gislason, H. 2005. Using size-based indicators to evaluate the ecosystem effects of fishing. ICES Journal of Marine Science, 62: 384–396. https://doi.org/10.1016/j.icesjms.2005.01.004.
Sprules, W. G., and Barth, L. E. 2016. Surfing the biomass size spectrum: some remarks on history, theory, and application. Canadian Journal of Fisheries and Aquatic Sciences, 73: 477–495. http://www.nrcresearchpress.com/doi/10.1139/cjfas-2015-0115.
Staudinger, M. D., Mills, K. E., Stamieszkin, K., Record, N. R., Hudak, C. A., Allyn, A., Diamond, A., et al. 2019. It’s about time: A synthesis of changing phenology in the Gulf of Maine ecosystem. Fisheries Oceanography, 28: 532–566. https://onlinelibrary.wiley.com/doi/abs/10.1111/fog.12429.
---title: "Community Size-Structure of the Northwest Atlantic Groundfish Communities, Response to Direct Disturbance and a Changing Environment"author: name: "Adam A. Kemberling" title: "Quantitative Research Associate" url: https://github.com/adamkemberling affiliation: Gulf of Maine Research Institutedescription: | Results from a Individual Size Distribution Analysis of the Northeast US Groundfish Communitydate: "`r Sys.Date()`"# format: docxformat: html: self-contained: true code-fold: true code-tools: true df-print: kable toc: true toc-depth: 2editor: sourceexecute: echo: false warning: false message: false fig.height: 6 fig.width: 7 fig.align: center comment: ""bibliography: references.bibcsl: ices-journal-of-marine-science.csl---```{r}#| label: load packages and functions#### Packages ####library(car)library(targets)library(rnaturalearth)library(here)library(sf)library(gmRi)library(patchwork)library(gt)library(knitr)library(tidyverse)library(ggstream)library(broom)library(bcp)library(scales)library(corrplot)library(readxl)library(Hmisc)library(MuMIn)library(gtsummary)library(broom.helpers)# changepoint packagelibrary(EnvCpt)# Package Conflictsconflicted::conflict_prefer("select", "dplyr")conflicted::conflict_prefer("filter", "dplyr")# Support functionssource(here("R/support/sizeSpectra_support.R"))# Resource Pathres_path <- gmRi::cs_path("res")# GGplot themetheme_set(theme_gmri() +theme(panel.grid.major =element_line(linewidth =0.5, linetype =1, color ="gray"),panel.grid.major.x =element_line(linewidth =0.5, linetype =3, color ="gray"),panel.grid.minor =element_line(linewidth =0.5, linetype =3, color ="gray90"),axis.line =element_line(color ="black"),axis.line.y =element_line(color ="black"), strip.background =element_rect(colour =gmri_cols("teal")),legend.position ="bottom"))# Map polygonsus_poly <-ne_states("united states of america", returnclass ="sf")canada <-ne_states("canada", returnclass ="sf")# levels for faceting areasarea_levels <-c("Northeast Shelf, Full", "GoM", "GB", "SNE", "MAB")area_levels_long <-c("Northeast Shelf, Full", "Gulf of Maine", "Georges Bank", "Southern New England", "Mid-Atlantic Bight")# table to join for swapping shorthand for long-hand namesarea_df <-data.frame(area =c("Scotian Shelf", "Gulf of Maine", "Georges Bank", "Southern New England", "Mid-Atlantic Bight", "All"),survey_area =c("SS", "GoM", "GB", "SNE", "MAB", "Northeast Shelf, Full"),area_titles =c("Scotian Shelf", "Gulf of Maine", "Georges Bank", "Southern New England", "Mid-Atlantic Bight", "Northeast Shelf, Full"))``````{r}#| label: functions-cell# Function to process summaries for various factor combinationsget_group_summaries <-function(data_in, ...){# Do some grouping to get totals group_totals <- data_in %>%group_by(...) %>%summarise(total_survey_catch =sum(numlen, na.rm = T),total_lw_bio =sum(sum_weight_kg, na.rm = T),total_strat_abund =sum(strat_total_abund_s, na.rm = T),total_strat_lw_bio =sum(strat_total_lwbio_s, na.rm = T), .groups ="drop") # length bins group_lengths <- data_in %>%group_by(..., length_bin) %>%summarise(lenbin_survey_catch =sum(numlen),lenbin_lw_bio =sum(sum_weight_kg),lenbin_strat_abund =sum(strat_total_abund_s),lenbin_strat_lw_bio =sum(strat_total_lwbio_s), .groups ="drop") %>%left_join(group_totals) %>%mutate(perc_total_catch = (lenbin_survey_catch - total_survey_catch) *100,perc_lw_bio = (lenbin_lw_bio - total_lw_bio) *100,perc_strat_abund = (lenbin_strat_abund - total_strat_abund) *100,perc_strat_lw_bio = (lenbin_strat_lw_bio - total_strat_lw_bio) *100)# weight bins group_weights <- data_in %>%group_by(..., weight_bin) %>%summarise(wtbin_survey_catch =sum(numlen),wtbin_lw_bio =sum(sum_weight_kg),wtbin_strat_abund =sum(strat_total_abund_s),wtbin_strat_lw_bio =sum(strat_total_lwbio_s),.groups ="drop") %>%left_join(group_totals) %>%mutate(perc_total_catch = (wtbin_survey_catch / total_survey_catch) *100,perc_lw_bio = (wtbin_lw_bio / total_lw_bio) *100,perc_strat_abund = (wtbin_strat_abund / total_strat_abund) *100,perc_strat_lw_bio = (wtbin_strat_lw_bio / total_strat_lw_bio) *100)return(list("length_bins"=drop_na(group_lengths),"weight_bins"=drop_na(group_weights)))}# Function to process summaries for various factor combinationslog2_bin_summaries <-function(data_in, ...){# Do some grouping to get totals group_totals <- data_in %>%group_by(...) %>%summarise(total_survey_catch =sum(numlen, na.rm = T),total_lw_bio =sum(sum_weight_kg, na.rm = T),total_strat_abund =sum(strat_total_abund_s, na.rm = T),total_strat_lw_bio =sum(strat_total_lwbio_s, na.rm = T), .groups ="drop") # weight bins group_weights <- data_in %>%mutate(weight_bin = bin_label) %>%group_by(..., left_lim) %>%summarise(wtbin_survey_catch =sum(numlen),wtbin_lw_bio =sum(sum_weight_kg),wtbin_strat_abund =sum(strat_total_abund_s),wtbin_strat_lw_bio =sum(strat_total_lwbio_s),.groups ="drop") %>%left_join(group_totals) %>%mutate(perc_total_catch = (wtbin_survey_catch / total_survey_catch) *100,perc_lw_bio = (wtbin_lw_bio / total_lw_bio) *100,perc_strat_abund = (wtbin_strat_abund / total_strat_abund) *100,perc_strat_lw_bio = (wtbin_strat_lw_bio / total_strat_lw_bio) *100)return(list("weight_bins"=drop_na(group_weights)))}# Function to grab the correlation data and lag dataget_ccf_vector <-function(x,y){# Run the ccf ccf_data <-ccf(x,y,plot= F , lag.max =15)# Get the signif:# https://www.squaregoldfish.co.uk/programming/r_acf_significance.md/# Not 100% sure n is the same for ccf as it is for acf, but... significance_level <-qnorm((1+0.95)/2)/sqrt(sum(!is.na(x)))data.frame("acf"= ccf_data$acf,"lag"= ccf_data$lag,"sigpos"= significance_level,"signeg"= significance_level*-1 )}# Plotting the correlation matrix over a specific decadeplot_period_correlations <-function(x, label) { x %>%filter(response_short !="Spectra Slope", response_area !="All") %>%ggplot(aes(var1, response_short, fill = r2)) +geom_tile() +geom_text(aes(label = signif)) +facet_grid(response_area ~ driver_type, scales ="free_x", space ="free") +scale_fill_distiller(palette ="RdBu", limits =c(-1, 1), breaks =c(-1,0,1)) +coord_cartesian(clip ="off") +theme(axis.text.x =element_text(angle =45, hjust =1, vjust =1),axis.text.y =element_blank(),strip.text.y =element_text(angle =0),legend.position =c(1.205, -.25),legend.title =element_text(vjust =0.75),plot.caption =element_text(hjust =0)) +guides(fill =guide_colorbar(title.position ="top", title.hjust =0.5)) +labs(x ="Correlated Feature", y ="Size Spectrum Slope",fill ="Correlation Coefficient",title =str_c(label, " | Correlation of Size Spectra to Hypothesized Drivers"),subtitle ="Correlation coefficients shown display same-year correlations.",caption =md("* Symbol signals a significant correlation at an alpha = 0.05"))}```### Potential Journals:1. **ICES Journal of Marine Science**# AbstractThe surface waters of the Northwest Atlantic Ocean are among the fastest warming on Earth. This area is highly-productive biologically, and there are concerns that ecological consequences will follow this rapid-warming. Research on the impacts of this rapid warming has primarily focused on high-profile and/or upper trophic level species. Ecological theory and laboratory studies suggest that elevated temperatures facilitate early maturation and smaller adult body-sizes. However, it is unclear whether that relationship might be mitigated against through adaptive behaviors in an open ocean environment. Here we've investigated ecosystem wide impacts on the individual size distribution (ISD) to track changes in community size structure. In cases where community responses are not adequate to counter the impacts of elevated temperatures, we anticipated a steepening of the size spectrum slope (ISD exponent). A steeper relationship relating to a reduction in larger sized individuals and an increased prevalence of smaller sized individuals. Using data from fisheries independent surveys we calculated the community size spectra for four regions along the US NE continental shelf. Correlation/regression analyses were then performed to then assess the degree to which these changes were in alignment to hypothesized bottom-up and top-down disturbances. At the regional scale, we found that community size structure changes (spectra slope) were the largest in the Northern regions, in the Gulf of Maine and Georges Bank. These areas are home to the coldest temperatures and the largest proportions of groundfish species in the community. Spectrum slope declines were most pronounced in the 80's and 90's, before the rapid warming of the last decade. The timing of these declines suggest that external factors drove the initial declines of larger-sized individuals within the communities, before elevated temperatures began to influence the ecosystem. Correlation analyses reveal that while fisheries landings are strongly correlated with these declines, bottom-up factors of zooplankton community metrics, Gulf Stream Index, and SST anomalies are also important. While the primary pressure of fisheries exploitation has declined dramatically over time, the recovery of larger-sized individuals has not been seen. That kind of recovery will likely depend on the elevated temperatures seen over the last decade.# Introduction## Normalized Size Spectra & Individual Size DistributionsSize is a defining characteristic of species and mediates many ecological interactions and metabolic pathways (@brown2000) . Size is a big factor in determining the mobility of an organism. Mobility then impacts the ability to evade predation, foraging success, efforts to locate and follow essential habitats, geographic ranging, and the metabolic costs associated with each of these behaviors (@hillaert2018). Body size also mediates vulnerability to aspects the immediate environment such as temperature through heat exchange or the threat of desiccation in terrestrial species (@gillooly2001; @heatwole1969). Body size even informs life history features like life span and the trophic position an individual might occupy through its impact on metabolism and resource use (@white2007).Size structured environments are a fundamental organizational pattern globally that emerges from these relationships **add_citation**. Within strongly size-structured ecosystems, growth and maturity changes alter fitness and ultimately determine whether a species is successful in that environment **add_citation** . Ecological theory is rich with models relating how energy transfers from smaller prey species to larger predatory trophic levels, the allocation of energy for growth, and the trade offs of allocating energy towards those ends (@bertalanffy1938; @vonbertalanffy1957; **add_citations)**. A globally persistent pattern in ecology entangled in those relationships and their critiques is the decline in abundance with increasing body size (@damuth1981; @currie1993; @sheldon1972; @loeuille2006).This relationship, between size and abundance, integrates multiple processes operating on the cellular, individual, and community levels simultaneously. The quantities for size and abundance are also some of the most readily collected data assets of any ecological community. This creates an opportunity to learn much about a system from a relatively low-effort in data collection. For these reasons, size spectrum analyses and individual size distribution (ISD) methods have gained increasing attention as an entry point to assessing ecosystem health and to detect system-wide disturbance (@shin2005'; @pomeranz2022; @white2007). An advantage of these models is they avoid the need to explicitly articulate each predator-prey interactions as they and can be estimated from the commonly collected measures of abundance and size. The "size spectrum" describes the distribution of biomass or abundance as a function of individuals' mass or size on a log--log scale (@guiet2016; @kerr2001) . Size spectra are described by two terms, the size spectrum slope & intercept. These two terms reveal a sense of the baseline productivity, and how energy flows through an ecosystem (in the form of biomass) from many smaller individuals to many fewer large individuals. The spectrum intercept has been linked to the productivity of the community, and is often connected to the prevailing environmental conditions (@boudreau1992; @rossberg2012).**How are they used in practice:**Size spectra condense the complexities of predator prey networks and their interactions into a handful of size-based indices. These indices capture the emergent properties of a system, and have become increasingly used as indicators of ecosystem health. Within the context of fisheries management, changes in spectrum slopes have been associated with fishing exploitation, primarily through the targeted removal of larger individuals (@bianchi2000; @shin2005). Numerical experiments have also linked changes in slope to environmental disturbances (@guiet2016). Size spectra have also been shown to express predictable relationships between ecosystems of similar productivity levels as well as from distinct temperatures (@guiet2016).**Use Pomerantz paper & Edwards to extend into ISD**## Temperature & EcologyTemperature plays a critical role on biological life impacting many of the chemical reactions that underpin basic physiological function. Temperature has direct and indirect impacts on critical biological functions including the acquisition of biomass through feeding, the rates of biomass loss through metabolism, and the rates at which individuals mature and develop. Because of these relationships, most species have evolved thermal preferences around which these chemical reactions are optimized. Species that are unable to maintain their thermal preferences internally must be able to follow their thermal preference in the environment through locomotion or adapt to less-favorable conditions through changes in behavior or risk metabolic costs in failing to do so. In an era of anthropogenic climate change, there is an expectation that many species will be displaced from historic habitats in their efforts to follow their thermal preferences. Recent research in marine environments has shown evidence of this as species are now shifting to higher latitudes and to deeper depths in the pursuit of more favorable conditions (@kleisner2017; @pinsky2013). Others have suggested that temperature related impacts may not be seen through geographic distribution change, but through physiological changes, changes in seasonal phenology, or in dashed hopes of species recovery (@daan2005; @miller2018; @pershing2015; @meyer-gutbrod2021).**Need to connect temperature to size here**Direct quote from @guiet2016 , but nails the connection back to temp expectations:> Because it controls chemical reactions, temperature controls metabolic rates which underpin maintenance, growth or reproduction (Clarke and Johnston, 1999; Kooijman, 2010) as well as the functional responses to food density (Rall et al., 2012). Guiet et al. (2016)... In addition to the impact of temperature on communities' intercepts (heights), the impact of temperature on the speed of the energy flow within communities may affect other properties, such as their resilience to perturbations or the intensity of trophic cascades (Andersen and Pedersen, 2009).The potential for elevated temperatures to impact the size structure of an ecosystem has implications for the ecosystem resilience in the face of climate change, as well as the blue economies & natural resource systems that rely upon their good health.### Temperature of the Gulf of Maine & NE ShelfIn addition to the ecological disturbances of industrial fisheries, the Northwest Atlantic is also one of the fastest warming locations in the global oceans. Sea surface temperatures in the Gulf of Maine since 1982 have been warming at rates faster than 96% of the world's oceans, with similar warming rates along the northwest Atlantic shelf (@pershing2018). This persistent elevated temperature regime of the area is a result of several forces, a combination of shifting ocean currents and the unique bathymetry of the region. A Northward shift in the Gulf Stream directly increased the regional temperatures through increased transport of warm Gulf Stream water into areas like the Gulf of Maine. The Northward Gulf Stream shift is associated with a higher frequency of warm core rings, and the obstruction of cold-water Scotian Shelf current flow that would otherwise counter the influence of the Gulf Stream on the region's temperatures (@gangopadhyay2019; @meyer-gutbrod2021). The combination of these oceanographic changes has led to a warmer continental shelf habitat.The rapid warming in the northwest Atlantic is a major factor in the redistribution of marine species along the US east coast. Species have responded by adjusting the timing and locations of their seasonal migrations and shifting their geographic ranges (@nye2009; @staudinger2019). There is evidence that warming has hampered fisheries recoveries as well. Adding a metabolic tax to physiological pathways like growth and metabolism. Species like Black Sea Bass, Atlantic shortfin squid, and Blue crab have been high-profile examples of species expanding their ranges to follow their thermal preferences. While species like the American lobster have shown declines at their southern range near Long Island Sound, with much doubt whether they will recover under the present temperature trends. The recent regime shift in the physical oceanography has also shown to be a catalyst for biological shifts as well (@meyer-gutbrod2021; @perretti2017).While these examples show that species can respond to changes in the physical environment around them through movement & behavior, research elsewhere suggests that physiological responses integrated across species will manifest as changes in community size structure.## Fishing Impacts on Size/Species Composition**NEED fishing impacts on size structure references**### Species Trends in the Northeast Atlantic ShelfThe continental shelf groundfish community in Northwest Atlantic has changed dramatically over the last century. Stocks that supported international fishing effort collapsed, and recovery efforts fell short of their objectives. Research on Georges Bank estimated that biomass more than halved in the 1960's (pre-dating federal monitoring efforts), and noted a species replacement of commercial groundfish target species by skate and dogfish (@fogarty1998). Industrial fishing is inherently size-selective, with larger individuals selectively removed from the population. This has an immediate impact on the community size-distribution with additional impacts on the future population as well. Larger individuals have a greater impact on population recovery, capable of holding more (and often of higher quality) eggs. Size-based harvest in fisheries has been shown to create selective pressures that promote characteristics of early maturation at smaller sizes **add_citation**.**reference gb spectra early work**### PurposeWith the understanding that populations depend on the health of their ecosystems, there is a need to have community-wide metrics to effectively understand and manage marine resources. Size based indices are metrics that can be estimated from the information that has historically been available from long-term survey efforts. These indices have been shown to be sensitive to the impacts of fishing, but should also capture environmentally driven stress as well. We estimated size spectrum relationships as SBI's for the groundfish populations for each sub-region of the Northeast US continental shelf. In the case of the NW Atlantic sustained increases in temperature should have a physiological impact on the community size structure.This leads to our second hypothesis:> H2. Warming alters the community through the direct influence of temperature on metabolism, growth, and population productivity.# Methods## Fish Data Source and Processing```{r}#| label: load-survdat# 1. Biological data used as inputwithr::with_dir(rprojroot::find_root('_targets.R'), tar_load(catch_log2_labelled)) # rename and formatcatch_size_bins <- catch_log2_labelled %>%fill_func_groups()# Get total number of speciesn_species <-length(unique(catch_size_bins$comname))```Data on the biomass, abundance, and size of fish on the Northeast U.S. Shelf were collected as part of the Northeast Fisheries Science Center's bottom trawl survey ( @grosslein1969, @Azarovits1981, @politis2014 ) This survey is conducted from Cape Hatteras, North Carolina to the Gulf of Maine each year in the spring and in the fall. The survey follows a stratified random sampling design, with strata defined based on depth, bottom habitat, and latitude. Trawls are performed for a fixed duration at each station, reporting aggregate abundance and biomass for all species caught, and measuring individual lengths and weights for the catch of each species or a sub-sample if that catch is large. Correction factors were applied to aggregate species abundance and biomass to account for changes in vessels, gear, and doors used in the survey over time ( @sissenwine1978, @byrne1991, @miller2010 ) However, abundance and biomass at length needed to be estimated after these aggregate corrections. As such, abundance at length for each species was adjusted to match the corrected aggregate species abundance at each station, such that for each species, the sum of the resulting estimated abundance numbers across each length is equal to the corrected aggregate abundance.```{r}#| label: study-area-maps#------------- Trawl Strata --------------# Get the paths to the shapefiles used to masktrawl_paths <- gmRi::get_timeseries_paths(region_group ="nmfs_trawl_regions", box_location ="cloudstorage")# Polygons for each regionall_poly <-read_sf(trawl_paths[["inuse_strata"]][["shape_path"]]) # Make the Mapstrata_map <-ggplot() +geom_sf(data = us_poly) +geom_sf(data = canada) +geom_sf(data = all_poly, aes(fill = survey_area), alpha =0.8) +coord_sf(xlim =c(-76.4, -64.4), ylim =c(35, 45.5), expand = F) +scale_fill_gmri() +theme_bw() +map_theme(legend.position =c(0.6, 0.125), legend.background =element_rect(fill ="white"),plot.caption =element_text(hjust =0)) +guides(fill =guide_legend(title ="", nrow =2)) # ----------- Landings Statistical Zones -----------------# Shapefiles for the fisheries stat zonesstat_zones <-read_sf(str_c(res_path, "Shapefiles/Statistical_Areas/Statistical_Areas_2010_withNames.shp"))# Make a list of zones to roughly match the survey areas:fish_zones <-list("Gulf of Maine"=c(511:515, 464, 465),"Georges Bank"=c(521, 522, 525, 561, 562),"Southern New England"=c(611, 612, 613, 616, 526, 537, 538, 539),"Mid-Atlantic Bight"=c(614:615, 621, 622, 625, 626, 631, 632))# Trim out what we don't need and label themstat_zones <- stat_zones %>%mutate(survey_area =case_when( Id %in% fish_zones$"Gulf of Maine"~"Gulf of Maine", Id %in% fish_zones$"Georges Bank"~"Georges Bank", Id %in% fish_zones$"Southern New England"~"Southern New England", Id %in% fish_zones$"Mid-Atlantic Bight"~"Mid-Atlantic Bight")) %>%filter(survey_area %in%c("Georges Bank", "Gulf of Maine", "Southern New England", "Mid-Atlantic Bight"))# Map it out? - takes forever from Chesapeake baygarfo_map <-ggplot(stat_zones) +geom_sf(aes(fill = survey_area), alpha =0.8) +geom_sf(data = us_poly) +geom_sf(data = canada) +coord_sf(xlim =c(-76.4, -64.4), ylim =c(35, 45.5), expand = F) +scale_fill_gmri() +theme_bw() +map_theme(legend.position =c(0.6, 0.125),legend.background =element_rect(fill ="white"),plot.caption =element_text(hjust =0)) +guides(fill =guide_legend(title ="", nrow =2)) # ggsave(garfo_map, filename = here::here("R/nmfs_size_spectra/images/garfo_regions.png"), bg = "white")# Map them together(strata_map | garfo_map) +plot_layout(guides ="collect")```### Community Composition & Functional GroupsAnalyses were performed using 68 species. These species were selected based on the availability of published weight-at-length relationships (Wigley et al. 2003) and represented 98.98% of the total biomass caught in the survey. Each species was assigned to a functional group based on life history and geography using the definitions of (@hare2016). Functional groups included coastal fish, diadromous fish, elasmobranch, groundfish, and pelagic fish (@tbl-functional-groups). Six species with available length-weight details did not have a functional group designation, these species were designated as **reef species**. Exploratory analyses showed that the pelagic species biomass was low in all regions, and is unlikely to be representative of true biomass trends due to gear selectivity.```{r}#| label: biomass-group-summaries# This cell provides the group summaries for how much biomass exists in each size/functional group# Actual log10 bins# Do some grouping by year survey area and functional groupfgroup_area <-log2_bin_summaries(data_in = catch_size_bins, Year, survey_area, hare_group)fgroup_all <-log2_bin_summaries(data_in = catch_size_bins, Year, hare_group)fgroup_all <- fgroup_all %>%map(~mutate(.x, survey_area ="Northeast Shelf, Full"))# Join the overall values back infgroup_area <-map2(fgroup_area, fgroup_all, bind_rows)```Published length-weight relationships ( Wigley et al. 2003 ) were used to convert from length data, available for all individuals, into their corresponding biomass-at-length. To account for differences in sampling effort among survey strata, all corrected abundance-at-length data were area-stratified. Area-stratified biomass-at-length values were then computed as the product of area-stratified abundance-at-length and estimated weight-at-length. All analyses were performed using area-stratified abundances and their associated area-stratified biomass estimates, hereby referred to as simply abundance-at-length & biomass-at-length<!--# Everything above here in methods describes the data source, and its pre-processing, Everything below details what is done with it -->## Community Composition MetricsOur analyses used all data collected during the spring and fall surveys from 1970-2019. Data were grouped using survey-design strata into four sub-regions: Gulf of Maine, Georges Bank, Southern New England, Mid-Atlantic Bight. These sub-regions have been widely used in regional ecological studies (e.g., ). For each region, we developed several time series indicators:1. Community Composition Metrics (regional abundance and biomass by functional group)2. Average length and weight of the aggregate community and each functional group3. Exponent of the Individual Size Distribution Spectra### Body Size ChangesThe average body length (cm) and body weight (kg) was calculated for all catch for each region and within each functional group. Each body-size metric was weighted by the abundance-at-length of a species sampled at each station such that.$$X_j = \frac{ \sum_{}^{} n_iX_{ij} }{ \sum~w_i }$$ Where $X$ is the body-size metric, $j$ is the year, & $n$ is the abundance for each station $i$.Data for body size trends were not truncated using any minimum or maximum size and reflect all available catch data for the `r n_species`, for which biomass-at-length could be estimated.```{r}#| label: body size data# Load the average body size datawithr::with_dir(rprojroot::find_root('_targets.R'), tar_load(mean_sizes_ss_groups)) # Overall across all regionsshelf_size <- mean_sizes_ss_groups %>%filter(`group ID`=="single years") %>%mutate(survey_area ="Northeast Shelf, Full")# Grouped on year and regionregional_size <- mean_sizes_ss_groups %>%filter(`group ID`=="single years * region") %>%bind_rows(shelf_size) %>%mutate(Year =as.numeric(Year),survey_area =as.character(survey_area)) %>%left_join(area_df) %>%mutate(area_titles =factor(area_titles, levels = area_levels_long),area_copy = survey_area)# 2. Grouped on year, region, & functional groupfunctional_group_size <- mean_sizes_ss_groups %>%filter(`group ID`=="single years * region * functional group") %>%mutate(Year =as.numeric(Year),survey_area =factor(survey_area, levels = area_levels))# Refactor areasfunctional_group_size <- functional_group_size %>%mutate(survey_area =as.character(survey_area)) %>%left_join(area_df) %>%mutate(area_titles =factor(area_titles, levels = area_levels_long),area_copy = survey_area)```### Estimating Size Spectra```{r}#| label: load the size spectrum indiceswithr::with_dir(rprojroot::find_root('_targets.R'), tar_load(size_spectrum_indices)) # Grab the overall shelf-wide SSshelf_ss <- size_spectrum_indices %>%filter(`group ID`=="single years") %>%mutate(survey_area ="Northeast Shelf, Full")# Grab SS Groups for each regionregion_indices <- size_spectrum_indices %>%filter(`group ID`=="single years * region") %>%bind_rows(shelf_ss) %>%mutate(yr =as.numeric(as.character(Year)),survey_area =factor(survey_area, levels = area_levels),sig_fit =ifelse(log2_sig_strat <0.05, "Significant", "Non-Significant"))# Make a copy so we can gray out the lines in plots# Merge the long names for labels "area_titles"region_indices <- region_indices %>%mutate(survey_area =as.character(survey_area)) %>%left_join(area_df) %>%mutate(area_titles =factor(area_titles, levels = area_levels_long),area_copy = survey_area)```Lengths of individuals in the catch data are measured to the nearest cm, with smaller specimens measured to the nearest millimeter. Because individual biomass is estimated from those length measurements, there is a range of possible body mass values between the cm & mm increments. The relationship between length and mass in fishes is exponential and taxon specific, so biases resulting from using only the lower or upper end of those ranges is different for each taxon and increases for larger taxa. To account for this and reduce biases, we used the extended likelihood method (MLEbin) of @edwards2020. This method estimates the exponent of size spectra (b) for a bounded power law relationship between abundance and our length-derived estimates of individual biomass.The exponent of this relationship is analogous to slope estimates from linear regression on logarithmic axes, with a steeper slope (i.e. a more negative b) indicating fewer large-bodied individuals and/or more small-bodied fishes @carvalho2021. This method has been shown to be the most accurate for estimating the exponent of size spectra when tested against alternative methods that require arbitrary decisions around binning the data (@white2008; @sprules2016; @edwards2017 ).$$\begin{align*}f(x) = \frac{ (\lambda + 1)x^{\lambda} }{ x^{\lambda+1}_{max} - x^{\lambda+1}_{min} }~~~~~~\lambda\neq1,\\\\f(x) = \frac{1}{logx_{max} - logx_{min}}~~~~~~\lambda=1\end{align*}$${#eq-ISD}Using this method, the size spectra exponent (b) was estimated for each survey region from 1970 to 2019. A minimum biomass of 1g was used for the lower limit and a maximum biomass of 10kg was used as an upper limit for the ISD's bounded power law probability density function @eq-ISD, where $x$ is body mass & $\lambda$ is the scaling exponent of the ISD. The biomass within these limits constitutes 97.83% of all estimated biomass for the `r n_species` used in this study. This truncation accounts for poor gear selectivity at the smallest and largest size ranges and imposes shared limits to the size distribution we'd expect to sample across these different areas. Exponents of size spectra were calculated using code modified from the sizeSpectra package in R (@edwards2017; @edwards2020).## Drivers of Size Spectrum ChangeThe impact of external factors on the changes in size spectra was correlated against several hypothesized driving forces related to both environmental regimes and anthropogenic disturbances. Potential large-scale environmental drivers include regional SST anomalies & the larger-scale impact of the Gulf Stream Index (GSI). Fishing pressure represents the primary top-down anthropogenic driver in the region, and was investigated using the aggregate regional commercial landings data. The potential bottom-up impacts of zooplankton community differences were incorporated at a regional scale using small and large zooplankton indices from the EcoMon plankton survey.### Gulf Stream IndexGulf stream index (GSI) values were obtained from the ecodata package in R (Bastille & Hardison 2018) . This package supplies GSI data at monthly intervals following the methodology of @pérez-hernández2014 and @joyce2019 , using as sea level height anomaly data from the Copernicus Marine Environment Monitoring Service```{r}#| label: gsi-data-prep#| eval: true#### 2. Climate Drivers ##### From ecodata: GSI & Zooplankton#### GSI# Gulf Stream Indexgsi <- ecodata::gsi %>%mutate(Time =str_pad(as.character(Time), width =7, side ="right", pad ="0"),year =as.numeric(str_sub(Time, 1, 4)),month =str_sub(Time, -2, -1),Time =as.Date(str_c(year, month, "01", sep ="-")))```### Sea Surface Temperature DataGlobal Sea surface temperature data was obtained via NOAA's optimally interpolated SST analysis (OISSTv2), providing daily temperature values at a 0.25° latitude x 0.25° longitude resolution (Reynolds et al. 2007). A daily climatology for every 0.25° pixel in the global data set was created using average daily temperatures spanning the period of 1982-2011. Daily anomalies were then computed as the difference between observed temperatures and the daily climatological average. OISSTv2 data used in these analyses were provided by the NOAA PSL, Boulder, Colorado, USA from their website at [https://psl.noaa.gov/data/gridded/data.noaa.oisst.v2.html](https://www.google.com/url?q=https://psl.noaa.gov/data/gridded/data.noaa.oisst.v2.html&sa=D&source=docs&ust=1675370947039361&usg=AOvVaw1ShFZYjArUTbnJyV_pkcaK).Sea surface temperature data was regionally averaged to match the survey regions from the age-at-length data. SST anomalies were averaged by year for each region and over the entire sampling region to produce daily time series. These time series were then processed into annual timeseries of surface temperatures and anomalies. All region-averaging was done with area-weighting of the latitude/longitude grid cells to account for differences in cell-size in the OISSTv2 data.```{r}#| label: load temperature data# Load the regional temperatures from {targets}withr::with_dir(rprojroot::find_root('_targets.R'), tar_load(regional_oisst))# Get regional averagestemp_regimes <- regional_oisst %>%filter(yr >1981) %>%mutate(regime =ifelse(sst_anom >0, "hot", "cold"),survey_area =ifelse( survey_area =="all", "Northeast Shelf, Full", survey_area),survey_area =factor(survey_area, levels = area_levels)) ```### Commercial FishingFishing pressure in the region was indexed using state and federal commercial fishing landings. These data were obtained from the Greater Atlantic Regional Fisheries Office (GARFO) for statistical areas that are routinely used for fisheries reporting and management (Figure X). Individual statistical areas were aggregated into regions that closely align with the survey areas we defined for the size spectra analyses (Figure X).```{r}#| label: garfo-data-prep#| eval: true#### GARFO Landings ##### Load the GARFO landings data:# Landings of finfish* sheet 5landings <-read_xlsx(path =str_c(res_path, "GARFO_landings/KMills_landings by area 1964-2021_JUN 2022.xlsx"), sheet =5) %>%rename_all(tolower)# Add the labels into the landings data and remove what we don't need there:landings <- landings %>%mutate(survey_area =case_when(`stat area`%in% fish_zones$"Gulf of Maine"~"Gulf of Maine",`stat area`%in% fish_zones$"Georges Bank"~"Georges Bank",`stat area`%in% fish_zones$"Southern New England"~"Southern New England",`stat area`%in% fish_zones$"Mid-Atlantic Bight"~"Mid-Atlantic Bight")) %>%filter(survey_area %in%c("Georges Bank", "Gulf of Maine", "Southern New England", "Mid-Atlantic Bight")) %>%mutate(survey_area =factor(survey_area, area_levels_long))# Get Summarieslandings_summ <- landings %>%drop_na() %>%rename("weight_lb"=`landed lbs`,"live_lb"=`live lbs`) %>%drop_na() %>%group_by(year, survey_area) %>%summarise( across(.cols =c(value, weight_lb, live_lb), .fns =list(mean =~mean(.x , na.rm = T), total =~sum(.x , na.rm = T)), .names ="{.fn}_{.col}"), .groups ="drop") # Scale the landings to create an index by arealandings_summ <- landings_summ %>%group_by(survey_area) %>%mutate(total_wt_z = base::scale(total_weight_lb)) %>%ungroup()```### Zooplankton Abundance IndicesZooplankton data are from the National Oceanographic and Atmospheric Administration Marine Resources Monitoring, Assessment and Prediction (MARMAP) program and Ecosystem Monitoring (EcoMon) cruises detailed extensively in Kane (2007), Kane (2011), and Morse et al. (2017). Abundance anomalies are computed from the expected abundance on the day of sample collection. The small copepod index is computed by averaging the individual abundance anomalies of Pseudocalanus spp., Centropages hamatus, Centropages typicus, and Temora longicornis. The large-copepod anomalies is the abundance anomaly of Calanus finmarchicus, the largest copepod in the Northeast U.S. region. Anomalies for both large and small zooplankton groups are then averaged within ecological production units for annual timeseries.```{r}#| label: zp-data-prep# ADD Zooplankton: small and large zooplankton anomalies for the EPUS# Don't need to scale# Load original datazp_sli <- ecodata::zoo_sli_anom %>%filter(EPU !="SS")# Reformat to match the other indiceszp_index <- zp_sli %>%#EPUS combine SNE and MAB, we can repeat the values herebind_rows(filter(zp_sli, EPU =="MAB") %>%mutate(EPU ="SNE")) %>%mutate(survey_area =case_when( EPU =="GB"~"Georges Bank", EPU =="GOM"~"Gulf of Maine", EPU =="MAB"~"Mid-Atlantic Bight", EPU =="SNE"~"Southern New England"),Units =str_c("zp_", Var)) %>%select(year = Time, survey_area, Value, Units) %>%pivot_wider(names_from = Units, values_from = Value)``````{r}#| label: combine-drivers-table#### Build a Common form for all Drivers: ##### Format: year, area, var, value#--------------------#### Reshape all the inputs so that we can bind them together in one dataframe# 1.Climate Features# Join the climate modes togetherclim_drivers <- gsi %>%filter(EPU =="All")# Put climate drivers on annual scheduleclim_idx <- clim_drivers %>%group_by(year, area = EPU, var = Var) %>%summarise(value =mean(Value, na.rm = T),.groups ="drop")#--------------------# 2. GARFO landingslandings_idx <- landings_summ %>%group_by(year, area = survey_area) %>%summarise(value =mean(total_wt_z, na.rm = T),.groups ="drop") %>%mutate(var ="landings")#--------------------# 3. Temperaturesst_idx <- temp_regimes %>%left_join(area_df, by ="survey_area") %>%select(year = yr, area, value = sst_anom) %>%mutate(var ="sst_anom")#--------------------# 4. Zooplanktonzp_idx <- zp_index %>%pivot_longer(cols =c(zp_small, zp_large), names_to ="var", values_to ="value") %>%rename(area = survey_area)#--------------------# 4. Size Spectrum Slopes & Interceptsss_idx <- region_indices %>%select(year = Year, survey_area, spectra_slope = log2_slope_strat, #spectra_int = log2_int_strat,isd_slope = b) %>%pivot_longer(# cols = c(spectra_slope, spectra_int, isd_slope), cols =c(spectra_slope, isd_slope), names_to ="var", values_to ="value") %>%mutate(year =as.numeric(year)) %>%left_join(area_df, by ="survey_area") %>%select(-survey_area)#--------------------# All Metrics Togetherall_drivers <-bind_rows(list( clim_idx, landings_idx, sst_idx, zp_idx, ss_idx))``````{r}#| label: driver-scaling# In this chunk the drivers get reshaped to scale# Scaling should cover the period of analysis 1982-2019# SST and GSI (all index/anomaly fields) are not scaled# Put the data in a wide form for scaling:drivers_wide <- all_drivers %>%filter(year >=1982, year <=2019) %>%mutate(id =str_replace_all(str_c(area, "_", var), " ", "_")) %>%select(-c(area, area_titles, var)) %>%pivot_wider(names_from = id, values_from = value) %>%column_to_rownames(var ="year")# Scale the landings, not SST or GSIto_scale <-select(drivers_wide, ends_with("landings")) %>%scale() %>%as.data.frame()dont_scale <-select(drivers_wide, -ends_with("landings"))# Join back togetherdrivers_scaled <-bind_cols(dont_scale, to_scale) %>%rownames_to_column(var ="year")```## Analysis of Driver Impacts### Driver Cross Correlation FunctionsIn addition to evaluating the direct and contemporaneous impacts of each potential driver, additional predictors were identified for inclusion in the model-selection process through the use of cross correlation function (CCF) estimates and changepoint analysis. CCF estimates were used to look at correlation at one-year lag intervals to investigate potential time-delayed effects between the hypothesized drivers and the community’s size spectra. CCF estimates were performed between the dependent variables of the size spectra slope with the independent variables of the annual gulf stream index, the corresponding regional commercial landings and sea surface temperature anomalies, and both small and large zooplankton indices. Annual time lags on independent variables of up to 10 years that were identified in the CCF analysis were then included as additional candidate predictors in the regression analysis on drivers of the spectrum slope changes.```{r}#| label: ccf-data#| fig-height: 8#| fig-width: 8# Run CCF function to get lagged correlations,# Put that data in one table to plot in a better comparison form#------------ Processing Data for CCF. ------------------------# Scale Everything Before CCF,# Reshape the Drivers Wide to scale# Scale the columns independently using the variance in all years, but within an area# Are the columns scaling independently or together?# Confirmed independently scaling# Drivers scaled trimmed is a matrix of all the different timeseries# Columns that end with slope or intercept are pulled out to get a column of the response variables (slope features)# Columns that don't contain "spectra" or "year" capture the remaining features# Next we pull out the region information for eachdriver_ccf_prep <- drivers_scaled %>%select(-All_sst_anom) %>%#rownames_to_column(var = "year") %>% pivot_longer(names_to ="spectra_param", values_to ="spectra_values", cols =ends_with("slope") |ends_with("int")) %>%pivot_longer(names_to ="driver_var", values_to ="driver_values", cols =-matches("spectra|year")) %>%mutate(# C. These are the areas associated with the Spectra Featuresspectra_area =case_when(str_detect(spectra_param, "All") ~"All",str_detect(spectra_param, "Georges") ~"Georges Bank",str_detect(spectra_param, "Gulf") ~"Gulf of Maine",str_detect(spectra_param, "Southern") ~"Southern New England",str_detect(spectra_param, "Mid-Atlantic") ~"Mid-Atlantic Bight"),spectra_area =factor(spectra_area, levels =c("All", "Gulf of Maine", "Georges Bank", "Southern New England", "Mid-Atlantic Bight")),# D. These are the areas of the driversdriver_area =case_when(str_detect(driver_var, "All") ~"All",str_detect(driver_var, "Georges") ~"Georges Bank",str_detect(driver_var, "Gulf") ~"Gulf of Maine",str_detect(driver_var, "Southern") ~"Southern New England",str_detect(driver_var, "Mid-Atlantic") ~"Mid-Atlantic Bight"),driver_area =factor(driver_area, levels =c("All", "Gulf of Maine", "Georges Bank", "Southern New England", "Mid-Atlantic Bight"))) # Filter it out so there is only cases where the driver area matchesdriver_ccf_prep <- driver_ccf_prep %>%filter(driver_area == spectra_area | driver_area =="All") %>%arrange(year, driver_var)# Walk through each xvariable, and see how it correlates with each yvar# split on xvar# then split on yvars# make sure order is goodccf_relationships <- driver_ccf_prep %>%drop_na() %>%split(.$spectra_param) %>%map_dfr(function(x_param){ x_param %>%split(.$driver_var) %>%map_dfr(function(driver_y_data){ ccf_df <-get_ccf_vector(x = driver_y_data$spectra_values,y = driver_y_data$driver_values ) }, .id ="driver_var") }, .id ="spectra_param")# Build back out the labels for plottingccf_plot_data <- ccf_relationships %>%mutate(# Flag what the driver type wasdriver_type =case_when(str_detect(driver_var, "landings") ~"Commercial Landings",str_detect(driver_var, "sst") ~"SST",str_detect(driver_var, "index") ~"GSI",str_detect(driver_var, "small") ~"ZP-S",str_detect(driver_var, "large") ~"ZP-L",TRUE~"Missed Something"),param_feature =case_when(# Flag what the Size Distribution Parameter wasstr_detect(spectra_param, "int") ~"Spectra Intercept",str_detect(spectra_param, "isd") ~"ISD Exponent",str_detect(spectra_param, "slope") ~"Spectra Slope",TRUE~"Missed Something"),spectra_region =case_when(# Flag what region the driver was coming fromstr_detect(spectra_param, "All") ~"All",str_detect(spectra_param, "Georges") ~"Georges Bank",str_detect(spectra_param, "Gulf") ~"Gulf of Maine",str_detect(spectra_param, "Southern") ~"Southern New England",str_detect(spectra_param, "Mid-Atlantic") ~"Mid-Atlantic Bight"),# Flag when it crosses thresholdsig_flag =ifelse(acf < signeg | acf > sigpos, T, F),# Set Factor Levelsspectra_region =factor(spectra_region, levels =c("All", "Gulf of Maine", "Georges Bank", "Southern New England", "Mid-Atlantic Bight")))# Limit to one Response:ccf_plot_data <-filter(ccf_plot_data, param_feature =="ISD Exponent") %>%filter(spectra_region !="All")# Plot themccf_plot_data %>%ggplot(aes(x = lag, y = acf, group = driver_var, color = driver_type, fill = driver_type)) +geom_col(alpha =0.65) +geom_col(data =filter(ccf_plot_data, sig_flag),aes(lag, acf, fill = driver_type, group = driver_var), color ="black") +geom_text(data =filter(ccf_plot_data, sig_flag),aes(lag, y =0, label = lag, vjust =ifelse(acf <0, 1.5,-1.5)), color ="black") +geom_line(aes(x = lag, y = sigpos), linetype =2, color ="gray25") +geom_line(aes(x = lag, y = signeg), linetype =2, color ="gray25") +geom_hline(yintercept =0, color ="gray25", linewidth =1) +geom_vline(xintercept =0, color ="black", linewidth =1) +scale_color_gmri() +scale_fill_gmri() +facet_grid(spectra_region~driver_type*param_feature) +scale_x_continuous(limits =c(-15, 15), breaks =seq(from =-15, to =12, by =1)) +theme(legend.position ="bottom",axis.text =element_text(size =8)) +labs(x ="Driver Lag", y ="Correlation (CCF)", fill ="Driver Type:", color ="Driver Type:",caption ="All predictors scaled over year range of 1982-2019. Lags exceeding significance threshold outlined and numbered.")```### Slope Changepoint AnalysesChangepoint analysis was used as a means to explore the temporal structure in changes of the regional size spectrum slopes. Changepoint analysis was performed using the envcpt package (@killick et al), testing comparing support for 12 candidate model structures and any identified changepoints. The most supported changepoint structures were then used to inform whether (if any) regional spectra expressed distinct periods of behavior. If changepoints were identified, this was used as evidence that there may have been a community regime change. Following this thinking it is plausible for there to be inconsistencies or non-stationarity in the predictors' impact on the community spectra, as different community regimes may be differently vulnerable to top-down pressures like commercial fishing. To allow for changes in these relationships across different regimes, separated by changepoints, an additional interaction term was added to the regression model candidates for that region. ### Driver Correlation Periods```{r}#| label: driver-correlation-periods# Pull different years into a listdriver_periods <-list("1982-1998"=c(1982:1998),"1999-2006"=c(1999:2006),"2007-2019"=c(2007:2019),"1982-2019"=c(1982:2019))# -------------------------------------------------# Idea: Add functional_group_sizes into the correlation chart:# Reshape the size information so we can combine it with the drivers:# Values are scaled heresizes_wide <- functional_group_size %>%select(year = Year, survey_area, hare_group, avg_weight = mean_wt_kg) %>%pivot_longer(names_to ="var", values_to ="value", cols =c(avg_weight)) %>%inner_join(area_df, by ="survey_area") %>%select(-survey_area) %>%mutate(id =str_replace_all(str_c(area, "_", hare_group, "_", var), " ", "_")) %>%select(-c(area, area_titles, var, hare_group)) %>%pivot_wider(names_from = id, values_from = value, values_fn = {mean}) %>%column_to_rownames(var ="year") %>%scale() %>%as.data.frame()# Join regional size changes to the scaled driversdrivers_and_sizes <- drivers_scaled %>%left_join(rownames_to_column(sizes_wide, var ="year"), by ="year") %>%column_to_rownames(var ="year")# Split them out into a list that contains data for each period# Takes the years in driver_periods and pulls the rows that corresponddriver_period_list <-map(driver_periods, function(yrs){ period_matrix <- drivers_and_sizes[rownames(drivers_and_sizes) %in% yrs, ] %>%drop_na()})```### Regression AnalysesFollowing exploratory analyses with CCF functions and changepoint analyses, regression models were developed and tested independently for each region. Potential drivers included in the models were the annual Gulf Stream Index, the annual regionally averaged SST anomaly value, regionally averaged small and large copepod indices, and the scaled annual commercial landings from that region. Gulf stream index, zooplankton indices, and SST anomalies were not standardized prior to use in model fitting and evaluation as the two former are already an index value and the latter is scaled to a thirty-year climatological reference period. In the event that CCF analysis identified an important time lag for a region’s independent variables, these specific time-lagged predictors were included as predictors in the candidate models, otherwise time-lags on predictors were not evaluated. For any region(s) that showed support for changepoints in their temporal structure, the periods separated by those breaks were included as an interaction term on all other predictors. In the absence of any identified changepoints, no multi-year periods were included as potential regime impacts. All candidate models for a region were evaluated using AIC to determine the most parsimonious models for each regional system. ```{r}#| label: regression-analysis-data# Use scaled drivers for the regressions# Need to drop zooplankton because they are non-unique since we split the MAB EPU and repeated itregression_df <- drivers_scaled %>%select(!ends_with(c("zp_small","zp_large")))# Build back out the metadata:# Identify what region is associated with both: spectrum values & driver valuesregression_df <- regression_df %>%pivot_longer(names_to ="spectra_param", values_to ="spectra_values", cols =ends_with("slope") |ends_with("int")) %>%pivot_longer(names_to ="driver_var", values_to ="driver_values", cols =-matches("spectra|year")) %>%mutate(year =as.numeric(year),# A. Flag what the driver type wasdriver_type =case_when(str_detect(driver_var, "landings") ~"Commercial Landings",str_detect(driver_var, "sst") ~"SST",str_detect(driver_var, "index") ~"GSI",TRUE~"Missed Something"),# B. Flag what the spectrum feature wasparam_feature =case_when(# Flag what the Size Distribution Parameter wasstr_detect(spectra_param, "int") ~"Spectra Intercept",str_detect(spectra_param, "isd") ~"ISD Exponent",str_detect(spectra_param, "slope") ~"Spectra Slope",TRUE~"Missed Something"),# C. These are the areas associated with the Spectra Featuresspectra_area =case_when(str_detect(spectra_param, "All") ~"All",str_detect(spectra_param, "Georges") ~"Georges Bank",str_detect(spectra_param, "Gulf") ~"Gulf of Maine",str_detect(spectra_param, "Southern") ~"Southern New England",str_detect(spectra_param, "Mid-Atlantic") ~"Mid-Atlantic Bight"),spectra_area =factor(spectra_area, levels =c("All", "Gulf of Maine", "Georges Bank", "Southern New England", "Mid-Atlantic Bight")),# D. These are the areas of the driversdriver_area =case_when(str_detect(driver_var, "All") ~"All",str_detect(driver_var, "Georges") ~"Georges Bank",str_detect(driver_var, "Gulf") ~"Gulf of Maine",str_detect(driver_var, "Southern") ~"Southern New England",str_detect(driver_var, "Mid-Atlantic") ~"Mid-Atlantic Bight"),driver_area =factor(driver_area, levels =c("All", "Gulf of Maine", "Georges Bank", "Southern New England", "Mid-Atlantic Bight")),# E. Add Decade indecade =floor_decade(year),##regime = ifelse(year < 2010, "regime_1", "regime_2" ) ) # Filter it so predictors are only within matching regions, # or just overarching (GSI)regression_df <- regression_df %>%filter(driver_area == spectra_area | driver_type =="GSI") %>%arrange(year, driver_var)# Last wrangling:# columns for: year, slope, isd, spectra region, drver_region, GSI, sst_anom, landings# GSI is so annoying here, just rejoin it without the area columnregression_df <- regression_df %>%select(-c(spectra_param, driver_var)) %>%pivot_wider(names_from ="driver_type", values_from ="driver_values") %>%pivot_wider(names_from ="param_feature", values_from ="spectra_values") %>%select(-GSI) %>%filter(driver_area !="All") %>%left_join( select(clim_idx, year, GSI = value), by =c("year") ) %>%mutate(spectra_area =fct_drop(spectra_area))# Add zooplankton here so it can be checked in the regressions# Zooplankton values are duplicated in MAB and SNE b/c its the same EPUregression_df <-left_join( regression_df, zp_index %>%rename(spectra_area = survey_area), by =join_by(year, spectra_area))```# Results### Community AbundanceStratified abundance estimates have been gradually increasing on the Northeast Shelf since 1970. Abundance growth accelerated after 2007 to peak levels in 2014. Abundance then began declining, which continued through 2019 (Figure 2). In the Gulf of Maine, fish abundance remained relatively-low until the 1990’s, at which point it began to steadily rise–closely aligning with the pattern for the Northeast Shelf. This increase in abundance reversed briefly during 2002-2006, but continued to rise after 2006 before hitting a regional peak in 2016. Populations then declined through 2019 similar to the shelf-wide trend. Georges Bank abundances were consistently low from 1970 around 2010. By 2014 abundance had roughly quadrupled, propelled by strong recruitment classes of haddock. After 2014, abundance quickly fell to numbers more similar to the 1970-2010 levels by the end of the decade. Community abundance in Southern New England displayed higher inter-annual changes across all years compared to both Gulf of Maine and Georges Bank. Abundance in this area showed a less dramatic rise and fall than the Northern regions. Abundances began increasing rapidly here in 2007, before falling back to earlier levels by the end of the 2010’s decade. The Mid Atlantic Bight displayed the most inter-annual variability and had no major trends in fish abundance.```{r}#| label: abundance distributions#| fig.width: 8#| fig-height: 8# panel plot the biomass by body sizefgroup_area$weight_bins <- fgroup_area$weight_bins %>%mutate(strat_abund_mill = wtbin_strat_abund /1e6,strat_lwbio_mill = wtbin_strat_lw_bio /1e6,survey_area =factor(survey_area,levels = area_levels))# Just abundance, not by groupsstrat_abundance_plot <- fgroup_area$weight_bins %>%filter(survey_area =="Northeast Shelf, Full") %>%group_by(Year, survey_area, hare_group) %>%summarise(abund_millions =sum(strat_abund_mill, na.rm = T),.groups ="drop") %>%#filter(survey_area == "GoM") %>% ggplot(aes(Year, abund_millions, fill = hare_group)) +#geom_area(color = "gray30", linewidth = 0.1) +geom_col(color ="gray30", linewidth =0.1, position ="stack") +scale_fill_gmri() +scale_x_continuous(expand =expansion(add =c(0,0))) +scale_y_continuous(labels =comma_format()) +facet_wrap(~survey_area, ncol =1, scales ="free") +labs(y ="Abundance (millions)", x =NULL,fill ="Functional Group") +theme(legend.position ="bottom")# Just Biomass, by area and functional groupstrat_biomass_plot <- fgroup_area$weight_bins %>%filter(survey_area =="Northeast Shelf, Full") %>%group_by(Year, survey_area, hare_group) %>%summarise(abund_millions =sum(strat_lwbio_mill, na.rm = T),.groups ="drop") %>%ggplot(aes(Year, abund_millions, fill = hare_group)) +# geom_area(color = "gray30", linewidth = 0.1) +geom_col(color ="gray30", linewidth =0.1, position ="stack") +scale_fill_gmri() +scale_x_continuous(expand =expansion(add =c(0,0))) +scale_y_continuous(labels =comma_format()) +facet_wrap(~survey_area, ncol =1, scales ="free") +labs(y ="Biomass (million kg)", x =NULL,fill ="Functional Group") +theme(legend.position ="bottom")# Biomass and Abundance Together(strat_abundance_plot / strat_biomass_plot +labs(x ="Year")) +plot_layout(guides ="collect") &theme(legend.position ="bottom")```Groundfish species were the dominant functional group driving the abundance and biomass trends in the Gulf of Maine and Georges Bank, with the two southern regions showing a more balanced abundance distribution among the five functional groups (Figure 2). ```{r}#| label: abundance-by-group#| fig.height: 8#| fig.width: 10#| eval: false# Plot abundance by region and Groupfgroup_area$weight_bins %>%filter(survey_area !="Northeast Shelf, Full") %>%ggplot(aes(Year, strat_abund_mill, fill = left_lim)) +geom_col(color ="gray30", linewidth =0.1) +facet_grid(survey_area~hare_group) +scale_fill_distiller(palette ="RdYlGn", direction =-1, breaks =seq(0,14, 2)) +guides(fill =guide_colorsteps(barwidth =unit(3, "in"))) +scale_y_continuous(labels =comma_format()) +labs(y ="Abundance (millions)", x =NULL, fill ="Minimum Individual Weight 2^x (g)") +theme(legend.position ="bottom",axis.text.x =element_text(angle =45, hjust =1))```### Community BiomassSimilar to abundance, the overall biomass was highest in the two northern regions, the Gulf of Maine and Georges Bank. Roughly half of the biomass sampled in these regions can be attributed to groundfish species, with the second largest contributions coming from elasmobranchs. Within the groundfish biomass, larger individuals \>2kg in particular, declined during the 70's and 80's in these regions, never truly recovering. Beginning in the 2000's there were signs that groundfish abundances were increasing as evidenced by increasing numbers of smaller individuals, however in both regions this trend appears to have reversed by the mid 2010's. Elasmobranch biomass increased steadily throughout the survey time period across all regions, with the exception of southern New England. This area showed large 5-10 year swings in biomass, but no clear long-term trend. Larger elasmobranch were rare in all regions except for a period spanning the late 70's through the early 90's isolated to Georges Bank. Demersal species biomass was highest in the Gulf of Maine, dwarfing their contributions in other regions. Their biomass declined in the 70's, was flat until the late 90's, remaining relatively high until declining in the late 2010's. Pelagic species biomass was low in all regions, and is unlikely to be representative of true biomass trends due to gear selectivity.```{r}#| label: bodymass distributions#| fig.width: 10#| fig-height: 8# panel plot the biomass by body sizeplot_bodymass_vertical <-function(p_area ="GoM", show_facets = F, ylab ="Biomass (million kg)"){ p <- fgroup_area$weight_bins %>%complete(Year, hare_group, survey_area, left_lim) %>%#filter(survey_area != "Northeast Shelf, Full") %>% filter(survey_area == p_area) %>%ggplot(aes(Year, strat_abund_mill, fill = left_lim)) +geom_col(color ="gray30", linewidth =0.1) +# facet_grid(survey_area~hare_group) +facet_grid(hare_group~survey_area, scales ="free") +scale_fill_distiller(palette ="RdYlGn", direction =-1, breaks =seq(0,14, 2), labels =math_format(2^.x)) +guides(fill =guide_colorbar(title.position ="top", title.hjust =0.5, barwidth =unit(4, "in"),frame.colour ="black", ticks.colour ="black")) +scale_y_continuous(labels =comma_format()) +labs(y = ylab, x =NULL, fill ="Minimum Individual Weight 2^x (g)") +theme(legend.position ="bottom",axis.text.x =element_text(angle =45, hjust =1),plot.margin =margin(t =0, r =0, b =0, l =2))if(show_facets == F){p <- p +theme(strip.text.y =element_blank(), strip.background.y =element_blank())}return(p)}# Build the piecesshelf_bmass <-plot_bodymass_vertical("Northeast Shelf, Full", show_facets = F)gom_bmass <-plot_bodymass_vertical("GoM", F, ylab =NULL)gb_bmass <-plot_bodymass_vertical("GB", F, ylab =NULL)sne_bmass <-plot_bodymass_vertical("SNE", F, ylab =NULL)mab_bmass <-plot_bodymass_vertical("MAB", T, ylab =NULL)# patch together( shelf_bmass | gom_bmass |gb_bmass |sne_bmass |gb_bmass |mab_bmass ) +plot_layout(guides ="collect")```### Regional Variation in Species CompositionThere was a distinct difference between Northern and Southern regions in the way biomass was distributed among the different functional groups. The largest contributors to biomass in the southern regions (southern New England & mid-Atlantic bight) was the elasmobranch community. While the northern regions (Gulf of Maine & Georges Bank) each had similar quantities of elasmobranch biomasses, there was also a comparable contribution of groundfish and in the Gulf of Maine there was a major component of demersal species as well.### Body Size TrendsThe average fish size in the Gulf of Maine (length and weight) declined the greatest of all regions over our study period. The average individual length was greatest in the 1970's in the 35-40cm range, falling to 28-33cm over the last decade. Body-weight fell dramatically in the 1980's, from around .75kg in the 1970's to .25-.30kg, roughly a third of what it had been. Georges Bank body sizes also declined during the study period, but less dramatically. Both of these Northern regions had brief period in the late 2000's where average length and weight rose, before falling again in the 2010's. The MAB region was the only region to see a long-term increase in both length and weight during the study period. SNE saw no long-term change in length, and a minor decline in average body-weight.::: panel-tabset#### All Species```{r}#| label: regional-size-trends#| fig.height: 8#| fig.weight: 8# Re-factorregional_size <- regional_size %>%filter(Year <2020) %>%mutate(survey_area =factor(survey_area, area_levels),area_copy = survey_area)# Get the average length and weight at each decadeavg_size_decades <- regional_size %>%mutate(decade_left =as.numeric(as.character(floor_decade(Year))),decade_right =as.numeric(as.character(floor_decade(Year+10))) -1,decade_center = (decade_left + decade_right)/2,decade_lab =str_c(decade_left, "-", decade_right)) %>%group_by(area_titles, decade_left, decade_right, decade_center, decade_lab) %>%summarise(mean_len_cm =mean(mean_len_cm),mean_wt_kg =mean(mean_wt_kg),.groups ="drop" )# Length plotavg_len_p <- regional_size %>%ggplot() +geom_line(aes(Year, mean_len_cm, group = area_titles), linewidth =1,show.legend = F, color ="gray60") +geom_segment(data = avg_size_decades,aes(x = decade_left, xend = decade_right, y = mean_len_cm, yend = mean_len_cm), linewidth =1.5) +geom_text(data = avg_size_decades,aes(x = decade_center, y = mean_len_cm, label =str_c(round(mean_len_cm, 1)) ), vjust =-1.05, size =4, fontface ="bold") +scale_color_gmri() +facet_wrap(~area_titles, ncol =1) +labs(title ="Average Length",y ="Length (cm)",color ="Region")# Weight plotavg_wt_p <- regional_size %>%ggplot(aes(Year, mean_wt_kg)) +geom_line(aes(group = area_titles), linewidth =1,show.legend = F,color ="gray60") +geom_segment(data = avg_size_decades,aes(x = decade_left, xend = decade_right, y = mean_wt_kg, yend = mean_wt_kg), linewidth =1.5) +geom_text(data = avg_size_decades,aes(x = decade_center, y = mean_wt_kg, label =str_c(round(mean_wt_kg, 1)) ), vjust =-1.05, size =4, fontface ="bold") +scale_color_gmri() +facet_wrap(~area_titles, ncol =1) +labs(title ="Average Weight",y ="Weight (kg)",color ="Region")# Plot side by side(avg_len_p | avg_wt_p) ```#### Functional Group Avg. Length```{r}#| label: functional-group-length-trends#| fig.width: 8#| fig-height: 8# Plot the different trends in lengthfunctional_group_size %>%ggplot(aes(Year, mean_len_cm)) +geom_line(aes(group = area_titles#,#color = area_copy ), linewidth =1,show.legend = F) +geom_smooth(color ="orange", method ="loess", linewidth =1) +facet_grid(hare_group~area_titles, scales ="free_y") +theme(axis.text =element_text(size =8)) +labs(title ="Average Length",y ="Length (cm)",color ="Region")```#### Functional Group Avg. Weight```{r}#| label: functional-group-weight-trends#| fig.width: 8#| fig-height: 8# Average individual weightfunctional_group_size %>%ggplot(aes(Year, mean_wt_kg)) +geom_line(aes(group = area_titles), linewidth =1,show.legend = F) +geom_smooth(color ="orange", method ="loess", linewidth =1, formula ='y ~ x') +facet_grid(hare_group~area_titles, scales ="free_y") +theme(axis.text =element_text(size =8)) +labs(title ="Average Weight",y ="Weight (kg)",color ="Region")```:::### Regional Size SpectraAt the start of our time series, back in the 1970's, there was a clear difference in the relative positions of spectra parameters among the different regions. Gulf of Maine and Georges Bank showed the least steep spectra slopes in the earlier time periods with slopes around -1 & -1.1 respectively. The relatively flat slopes in these regions both steepened over time, settling near -1.3 (GoM) and -1.5 (GB). Gulf of Maine experienced much of its decline during the 1980's and 1990's. There was a brief reversal in this trend during the 2000's, but slopes continued to steepen by 2010 and remained steep through 2019. Georges Bank did not experience as rapid of a decline, but experienced a similar long-term steepening. In contrast to the northern regions, SNE and MAB had steeper slopes in the -1.2 to -1.5 territory. The long term pattern for SNE was one of increasing volatility, but not so much a decline. The spectra slope for the MAB was less volatile, but similarly maintained a relatively stable wander around -1.4. By the end of the study period all regions had slopes that were at or near a similar level.```{r}#| label: size spectra results# Make decadal averages of spectrums slopesindices_decades <- region_indices %>%mutate(Year =as.numeric(Year),decade_left =as.numeric(as.character(floor_decade(Year))),decade_right =as.numeric(as.character(floor_decade(Year +10))) -1,decade_center = (decade_left + decade_right)/2,decade_lab =str_c(decade_left, "-", decade_right)) %>%group_by(area_titles, decade_left, decade_right, decade_center, decade_lab) %>%summarise( mean_b =mean(b, na.rm = T),mean_l2slope =mean(log2_slope_strat, na.rm = T), .groups ="drop")# 1. Plot the Regional Slopes, using the binned methodslope_timeline <- region_indices %>%ggplot(aes(yr, log2_slope_strat, group = area_titles)) +# geom_line(data = select(region_indices, -area_titles),# aes(group = area_copy),# alpha = 0.2, linewidth = 0.5) +geom_line(aes(group = area_titles), linewidth =1, color ="gray70") +geom_segment(data = indices_decades,aes(x = decade_left, xend = decade_right, y = mean_l2slope, yend = mean_l2slope), linewidth =1.5) +geom_text(data = indices_decades,aes(x = decade_center, y = mean_l2slope, label =str_c(round(mean_l2slope, 1)) ), vjust =-1.05, size =4, fontface ="bold") +scale_color_gmri() +facet_wrap(~area_titles, ncol =1) +labs(x ="Year", y ="Normalized Abundance Spectrum Slope") # 2. Exponent of ISD, MLE methodisd_timeline <- region_indices %>%ggplot(aes(yr, b, group = area_titles)) +# geom_line(data = select(region_indices, -area_titles),# aes(group = area_copy),# alpha = 0.2, linewidth = 0.5) +geom_line(aes(group = area_titles), color ="gray60", linewidth =1) +#geom_pointrange(aes(group = area_titles, ymin = confMin, ymax = confMax), size = 0.1) +geom_segment(data = indices_decades,aes(x = decade_left, xend = decade_right, y = mean_b, yend = mean_b), linewidth =1.5) +geom_text(data = indices_decades,aes(x = decade_center, y = mean_b, label =str_c(round(mean_b, 1)) ), vjust =-1.05, size =4, fontface ="bold") +scale_color_gmri() +facet_wrap(~area_titles, ncol =1) +labs(x ="Year", y ="Abundance Size Spectrum Slope (b)")```::: panel-tabset#### Exponent of Size Spectra```{r}#| fig.height: 8#| fig.weight: 8isd_timeline #+ geom_smooth(color = "orange", se = FALSE)```#### Binned Size Spectra```{r}#| fig.height: 8#| fig.weight: 8slope_timeline #+ geom_smooth(color = "orange", se = FALSE)```#### Contrast```{r}#| fig.height: 8#| fig.weight: 8# Assemble the figureslope_timeline +scale_y_continuous(limits =c(-1.35, -0.3)) | isd_timeline +scale_y_continuous(limits =c(-1.35, -0.3)) ```#### Method Correlation```{r}#| fig.height: 8#| fig.weight: 8region_indices %>%ggplot(aes(log2_slope_strat, b)) +geom_point() +geom_smooth(method ="lm", formula = y~x, color ="orange") +facet_wrap(~area_titles, ncol =1) +labs(x ="Normalized Abundance Spectra Slope", y ="Individual Size Distribution Exponent (b)", title ="Between Methods Correlation")```:::## Size Spectra Drivers## Spectra Slope changepoints and Regression AnalysisFollowing break point exploratory analyses with the EnvCpt package, the Gulf of Maine was the only region from which changepoints were identified in the best supported model from the candidate changepoint model structures. Changes in Gulf of Maine's spectrum slope was best modeled by a model allowing breaks in the trend, and with an auto-regressive process on the previous 2 years. From this model, breaks were identified at 1999 and 2007. Georges Bank was best modeled with a single trend, with no changepoints. Southern New-England showed the most support for a single intercept model, with no changepoints, with values centered on mu = -1.04 & sd = 0.046. The Mid-Atlantic Bight was best supported by a single trend model, with no changepoints, and an increasing intercept through the study period.```{r}# Pull Regions to test independently:regression_df <- regression_df %>%rename(landings =`Commercial Landings`,b =`ISD Exponent`)gom_df <-filter(regression_df, spectra_area =="Gulf of Maine") gb_df <-filter(regression_df, spectra_area =="Georges Bank") %>%drop_na()sne_df <-filter(regression_df, spectra_area =="Southern New England") mab_df <-filter(regression_df, spectra_area =="Mid-Atlantic Bight") %>%drop_na()# Check each for different changepoint structuresgom_cpt <-envcpt(gom_df$b, verbose = F)gb_cpt <-envcpt(gb_df$b, verbose = F)sne_cpt <-envcpt(sne_df$b, verbose = F)mab_cpt <-envcpt(mab_df$b, verbose = F)# Pulling out details of best supported modelcpt_res <-list("gom"=list("cpt_res"= gom_cpt),"gb"=list("cpt_res"= gb_cpt),"sne"=list("cpt_res"= sne_cpt),"mab"=list("cpt_res"= mab_cpt)) %>%map(function(x){ best <-names(which.min(AIC(x$cpt_res))) best_summ <- x$cpt_res[[best]] x$best <- best x$best_summ <- best_summreturn(x)})# Plot GOM, the only one with changepointsplot(gom_cpt, main ="Gulf of Maine, Break Point Analysis")# Put the rest of the results into a delta AIC table:bind_rows(map(cpt_res, ~tibble("Most Supported Model"= .x[["best"]])), .id ="Region") %>%kable()```### Driver Differences Across Regimeschangepoint analyses suggest there were as many as 3 distinct regimes in the Gulf of Maine slope changes. For all other regions there was not strong support for changepoints.```{r}#| label: period-correlations-processing# Run Correlations for Each period and for the entire perioddriver_period_correlations <-map(driver_period_list, function(drivers){#------ debug copy-paste line# Step 1. Get the correlation matrix (data is pre-scaled) corr_matrix <-rcorr(as.matrix(drivers))# R-squares rho_df <-as.data.frame(corr_matrix$r) %>%rownames_to_column(var ="var1") %>%pivot_longer(cols =-var1, names_to ="var2", values_to ="r2")# Significance Dataframe sig_df <-as.data.frame(corr_matrix$P) %>%rownames_to_column(var ="var1") %>%pivot_longer(cols =-var1, names_to ="var2", values_to ="pval")# Join corr_both <-left_join(rho_df, sig_df, by =c("var1", "var2")) # Step 2. Tidy Up for ggplot# Make correlation matrix & significance a dataframe for GGplot# Goal: Flag the different Functional Group size changes somehow# Mega Cleanup Code Chunk# This first step adds metadata, but mostly creates a way to filter out correlations to be in the direction we want. no sst as response variables for example# GOAL: All the drivers in one column, and all the responses in another, order doesn't matter corr_tidy <- corr_both %>%mutate(# 1.These flag whether or not the matrix variables in each column are the independent or dependent variablesis_var1_driver =ifelse(str_detect(var1, "landings|index|anom|avg|zp"), T, F),is_var2_response =ifelse(str_detect(var2, "slope|int"), T, F),# 2. Column to Flag what type of driver it is:driver_type =case_when(str_detect(var1, "landings") ~"Top-Down",str_detect(var1, "stream|anom|zp") ~"Bottom-Up",str_detect(var1, "length") ~"Avg. Length",str_detect(var1, "weight") ~"Avg. Weight"),# 3. Column to Flag what area the driver data is fromdriver_area =case_when(str_detect(var1, "All") ~"Northeast Shelf, Full",str_detect(var1, "Georges") ~"Georges Bank",str_detect(var1, "Gulf") ~"Gulf of Maine",str_detect(var1, "Southern") ~"Southern New England",str_detect(var1, "Mid-Atlantic") ~"Mid-Atlantic Bight"),# 4. Set Factor Levels for plotdriver_area =factor( driver_area, levels =c("All", "Gulf of Maine", "Georges Bank", "Southern New England", "Mid-Atlantic Bight")),# 5. Shorthand Names for the Spectra Parameters - Responseresponse_short =case_when(str_detect(var2, "int") ~"Spectra Intercept",str_detect(var2, "isd") ~"ISD Exponent",str_detect(var2, "slope") ~"Spectra Slope"),# 6. These are the areas of the driversresponse_area =case_when(str_detect(var2, "All") ~"All",str_detect(var2, "Georges") ~"Georges Bank",str_detect(var2, "Gulf") ~"Gulf of Maine",str_detect(var2, "Southern") ~"Southern New England",str_detect(var2, "Mid-Atlantic") ~"Mid-Atlantic Bight"),# 7. Set levels for the driver areasresponse_area =factor( response_area, levels =c("All", "Gulf of Maine", "Georges Bank", "Southern New England", "Mid-Atlantic Bight")),# 8. Flag when the areas are the same or not:area_match = driver_area == response_area )# Now we filter rows out to focus in on:# 1. one direction driver & response# 2. response and driver in same area, unless its GSIcorr_tidy_pairs <- corr_tidy %>%filter( is_var2_response, is_var1_driver, area_match |str_detect(var1, "index")) # From here it is just text formatting for plots:# Because we are faceting we can have repeated names to simplify thingscorr_tidy_pairs <- corr_tidy_pairs %>%mutate(var1 =ifelse(str_detect(var1, "landings"), "Commercial_Landings", var1),var1 =ifelse(str_detect(var1, "sst"), "SST_Anomalies", var1),var1 =ifelse(str_detect(var1, "gulf_stream_index"), "Gulf Stream Index", var1),var1 =ifelse(str_detect(var1, "zp_small"), "Small Zooplankton", var1),var1 =ifelse(str_detect(var1, "zp_large"), "Large Zooplankton", var1),var1 =str_replace_all(var1, "_", " "),# Make a functional group columnhare_group =case_when(str_detect(var1, "elasmobranch") ~"Elasmobranch",str_detect(var1, "pelagic") ~"Pelagic",str_detect(var1, "coastal") ~"Coastal",str_detect(var1, "diadromous") ~"Diadromous",str_detect(var1, "groundfish") ~"Groundfish",str_detect(var1, "reef") ~"Reef",TRUE~""),# Add the functional group column ahead of the average length and weight columnsvar1 =ifelse(str_detect(var1, "avg"), hare_group, var1),# Set The Factor Levels groupsvar1 =factor( var1,levels =c("Gulf Stream Index","SST Anomalies","Small Zooplankton","Large Zooplankton","Groundfish","Elasmobranch","Coastal","Diadromous","Pelagic","Reef","Commercial Landings")),# Flag Significance Levelssignif =ifelse(pval <0.05, "*", ""))return(corr_tidy_pairs)#------- })```::: panel-tabset#### All Years```{r}#| label: correlations-all#| fig-height: 6#| fig-width: 8period <-"1982-2019"plot_period_correlations(driver_period_correlations[[period]], period)```#### 1982-1998```{r}#| label: correlations-80s#| fig-height: 6#| fig-width: 8period <-"1982-1998"plot_period_correlations(driver_period_correlations[[period]], period)```#### 1999-2006```{r}#| label: correlations-90s#| fig-height: 6#| fig-width: 8period <-"1999-2006"plot_period_correlations(driver_period_correlations[[period]], period)```#### 2007-2019```{r}#| label: correlations-20s#| fig-height: 6#| fig-width: 8period <-"2007-2019"plot_period_correlations(driver_period_correlations[[period]], period)```:::## Spectra Slope Regression Analysis::: panel-tabset### Gulf of Maine```{r}#| label: gom-standard-models# change na. action for dredgeoptions(na.action ="na.fail")# Incorporate the patterns from the changepoint analysis# What if we enforce the break_periodsgom_df <-mutate(gom_df, regime =case_when( year <1999~"yrs_1982-1998", year <2007~"yrs_1999_2006", year >=2007~"yrs_2007_2019" ))# Add the autoregressive predictors neatly/manuallygom_df_lag <- gom_df %>%mutate(blag1 = dplyr::lag(b, 1),blag2 = dplyr::lag(b, 2),zpslag1 = dplyr::lag(zp_small, 1),zpslag2 = dplyr::lag(zp_small, 2)) %>%drop_na(blag2)# kitchen sink Autoregressive modelgom_ar_models =lm( b ~ regime * SST + regime * GSI + regime * landings + regime * zp_small + regime * zp_large + regime * blag1 + regime * blag2, data = gom_df_lag)# Model selection on models with lags at 1+2 yearsresults_gom <-dredge(gom_ar_models)# Best autoregressive modelbest_gom <-get.models(results_gom, subset = delta ==0)[[1]]```::: panel-tabset#### Model Selection Table```{r}#| label: gom-top-models-table#----------------------# Okay quick issues, for non-stationary effects, maybe just take the models under delta = 2# and from them pull the summary# Pull details for models with delta AIC < 2top_mods_gom <-get.models(results_gom, subset = delta <2)# Peformance, Gets the AIC order top_perf <-map_dfr(top_mods_gom, glance, .id ="id") %>%arrange(AIC)top_perf$Delta <- top_perf$AIC -min(top_perf$AIC)top_perf <- top_perf %>%arrange(Delta) %>%mutate(`Model ID`=row_number(),`Model ID`=str_c("Model ", `Model ID`))# Top Model Coefficients, join in the model number based on AIC ordertop_coef <-map_dfr(top_mods_gom, tidy, .id ="id") %>%left_join(select(top_perf, id, `Model ID`)) %>%select(-id)# Reshape Performance Specs to bind_row latertop_perf <-select(top_perf, `Model ID`, r2 = r.squared, Delta) %>%pivot_longer(cols =-1, names_to ="term", values_to ="estimate") %>%mutate(fill_color ="lightgray")# Clean up the data and put it into a graphtop_coef %>%# Tidy up the names and set the color scheme for fillmutate(term =ifelse(term =="landings", "Commercial Landings", term),term =ifelse(term =="zp_small", "Small Zooplankton", term),term =ifelse(term =="zp_large", "Large Zooplankton", term),term =ifelse(term =="blag2", "2-yr-AR", term),term =str_remove_all(term, "regime"),term =str_remove_all(term, "yrs_"),term =str_replace_all(term, ":", " ~ "),fill_color =ifelse(estimate<0, "#EACA00", "#00736D")) %>%# Join in the Performance Details and set Factor Ordersbind_rows(top_perf) %>%mutate(survey_area ="Gulf of Maine",`Model ID`=fct_rev(`Model ID`),estimate =round(estimate, 2)) %>%# Set the different types of estimates heremutate(type =case_when( term =="1999_2006"~"Regime\nIntercept", term =="2007_2019"~"Regime\nIntercept", term =="2-yr-AR"~"Fixed\nEffect", term =="r2"~"Model\nPerformance", term =="Delta"~"Model\nPerformance",str_detect(term," ~ ") ~"Regime\nInteraction",TRUE~"Fixed\nEffects"),type =factor(type, levels =c("Fixed\nEffects", "Regime\nIntercept", "Regime\nInteraction", "Model\nPerformance"))) %>%ggplot( aes(term, `Model ID`)) +geom_tile(aes(fill =I(fill_color)), color ="gray20", linewidth =0.2, alpha =0.75) +geom_text(aes(label = estimate)) +facet_grid(survey_area ~ type, scales ="free", space ="free") +theme(axis.text.x =element_text(angle =45, hjust =1, vjust =1)) +labs(x ="Model Predictor",y ="",title ="Gulf of Maine Size Spectra Driver Exploration",subtitle ="Interaction terms should be added to the corresponding fixed effect to determine net-influence." )```#### Model Summary```{r}# output summary - broom.helpers and gtsummary# Best Modeltbl1 <- best_gom %>%tbl_regression() %>%add_glance_table() %>%bold_labels()tbl1```#### VIF```{r}# make a model with all of themall_parms <-lm( b ~ regime * SST + regime * GSI + regime * landings + regime * zp_small + regime * zp_large, data = gom_df)# Check VIFvif(all_parms, type ="predictor") %>%as.data.frame() %>%rownames_to_column(var ="Predictor") %>%gt() ```#### Actual vs. Fitted```{r}augment(best_gom, gom_df_lag ) %>%# augment(best_gom, gom_df) %>% ggplot(aes(year)) +geom_line(aes(y = b, color ="Actual"), linewidth =1) +geom_line(aes(y = .fitted, color ="Predicted"), linewidth =1) +scale_color_gmri() +labs(color ="Color", y ="Spectrum Slope")```#### Residual Autocorrelation```{r}augment(best_gom, gom_df_lag ) %>%ggplot(aes(year, y = .resid)) +geom_line(linewidth =1, color ="gray30") +geom_point(size =2, color ="gray30") +geom_hline(yintercept =0, linetype =2, color ="darkred", linewidth =1) +scale_color_gmri() +labs(color ="Color", y ="Model Residuals")```:::### Georges BankExploratory analyses for Georges Bank indicated that the structure of the data did not show support for clear changepoints. CCF results of individual predictors showed that spectrum slope was correlated with small zooplankton index values from two years prior. This relationship was added to the standard predictor candidates, each operating at a simultaneous time step.```{r}# Georges Bank showedgb_df_lag <- gb_df %>%mutate(b = b,blag1 = dplyr::lag(b, 1),blag2 = dplyr::lag(b, 2),zpslag1 = dplyr::lag(zp_small, 1),zpslag2 = dplyr::lag(zp_small, 2)) %>%drop_na(blag2)# fit model with all parametersall_parms <-lm(b ~ SST + GSI + landings + zp_small + zp_large + zpslag1 + zpslag2, data = gb_df_lag)# Refine the dredgeresults <-dredge(all_parms)#grab best model# subset(results, delta == 0)bmod <-get.models(results, subset = delta ==0)[[1]]```::: panel-tabset#### Model Selection Table```{r}#| label: gb-top-models-table#----------------------# Pull details for models with delta AIC < 2top_mods_gb <-get.models(results, subset = delta <2)# Peformance, Gets the AIC order top_perf <-map_dfr(top_mods_gb, glance, .id ="id") %>%arrange(AIC)top_perf$Delta <- top_perf$AIC -min(top_perf$AIC)top_perf <- top_perf %>%arrange(Delta) %>%mutate(`Model ID`=row_number(),`Model ID`=str_c("Model ", `Model ID`))# Top Model Coefficients, join in the model number based on AIC ordertop_coef <-map_dfr(top_mods_gb, tidy, .id ="id") %>%left_join(select(top_perf, id, `Model ID`)) %>%select(-id)# Reshape Performance Specs to bind_row latertop_perf <-select(top_perf, `Model ID`, r2 = r.squared, Delta) %>%pivot_longer(cols =-1, names_to ="term", values_to ="estimate") %>%mutate(fill_color ="lightgray")# Clean up the data and put it into a graphtop_coef %>%# Tidy up the names and set the color scheme for fillmutate(term =ifelse(term =="landings", "Commercial Landings", term),term =ifelse(term =="zp_small", "Small Zooplankton", term),term =ifelse(term =="zp_large", "Large Zooplankton", term),term =ifelse(term =="zpslag2", "Small Zooplankton 2-yr Lag", term),fill_color =ifelse(estimate<0, "#EACA00", "#00736D")) %>%# Join in the Performance Details and set Factor Ordersbind_rows(top_perf) %>%mutate(survey_area ="Georges Bank",`Model ID`=fct_rev(`Model ID`),estimate =round(estimate, 2)) %>%# Set the different types of estimates heremutate(type =case_when( term =="r2"~"Model\nPerformance", term =="Delta"~"Model\nPerformance",str_detect(term,":") ~"Regime\nInteraction",TRUE~"Fixed\nEffects"),type =factor(type, levels =c("Fixed\nEffects", "Model\nPerformance"))) %>%ggplot( aes(term, `Model ID`)) +geom_tile(aes(fill =I(fill_color)), color ="gray20", linewidth =0.2, alpha =0.75) +geom_text(aes(label = estimate), size =4) +facet_grid(survey_area ~ type, scales ="free", space ="free") +theme(axis.text.x =element_text(angle =45, hjust =1, vjust =1)) +labs(x ="Model Predictor",y ="",title ="Georges Bank Size Spectra Driver Exploration",subtitle ="Interaction terms should be added to the corresponding fixed effect to determine net-influence." )```#### Model Summary```{r}# output summary - brrom.helpers and gtsummary# Best Modeltbl1 <- bmod %>%tbl_regression() %>%add_glance_table() %>%bold_labels()tbl1```#### VIF```{r}all_parms <-lm(b ~ SST + GSI + landings + zp_small + zp_large + zpslag1 + zpslag2, data = gb_df_lag)vif(all_parms) %>%as.data.frame() %>%rownames_to_column(var ="Predictor") %>%rename(VIF =".") %>%arrange(VIF) %>%gt() ```#### Actual vs. Fitted```{r}augment(bmod, gb_df_lag) %>%ggplot(aes(year)) +geom_line(aes(y = b, color ="Actual"), linewidth =1) +geom_line(aes(y = .fitted, color ="Predicted"), linewidth =1) +scale_color_gmri() +labs(color ="Color", y ="Spectrum Slope")```#### Residual Autocorrelation```{r}augment(bmod, gb_df_lag) %>%ggplot(aes(year, y = .resid)) +geom_line(linewidth =1, color ="gray30") +geom_point(size =2, color ="gray30") +geom_hline(yintercept =0, linetype =2, color ="darkred", linewidth =1) +scale_color_gmri() +labs(color ="Color", y ="Model Residuals")```:::### Southern New England```{r}# fit model with all parametersall_parms <-lm(b ~ SST + GSI + landings + zp_small + zp_large, data = sne_df)# Refine the dredgeresults <-dredge(all_parms)#grab best model# subset(results, delta == 0)bmod <-get.models(results, subset = delta ==0)[[1]]```::: panel-tabset#### Model Selection Table```{r}#| label: sne-top-models-table#----------------------# Pull details for models with delta AIC < 2top_mods_sne <-get.models(results, subset = delta <2)# Peformance, Gets the AIC order top_perf <-map_dfr(top_mods_sne, glance, .id ="id") %>%arrange(AIC)top_perf$Delta <- top_perf$AIC -min(top_perf$AIC)top_perf <- top_perf %>%arrange(Delta) %>%mutate(`Model ID`=row_number(),`Model ID`=str_c("Model ", `Model ID`))# Top Model Coefficients, join in the model number based on AIC ordertop_coef <-map_dfr(top_mods_sne, tidy, .id ="id") %>%left_join(select(top_perf, id, `Model ID`)) %>%select(-id)# Reshape Performance Specs to bind_row latertop_perf <-select(top_perf, `Model ID`, r2 = r.squared, Delta) %>%pivot_longer(cols =-1, names_to ="term", values_to ="estimate") %>%mutate(fill_color ="lightgray")# Clean up the data and put it into a graphtop_coef %>%# Tidy up the names and set the color scheme for fillmutate(term =ifelse(term =="landings", "Commercial Landings", term),term =ifelse(term =="zp_small", "Small Zooplankton", term),term =ifelse(term =="zp_large", "Large Zooplankton", term),fill_color =ifelse(estimate<0, "#EACA00", "#00736D")) %>%# Join in the Performance Details and set Factor Ordersbind_rows(top_perf) %>%mutate(survey_area ="Southern New England",`Model ID`=fct_rev(`Model ID`),estimate =round(estimate, 2)) %>%# Set the different types of estimates heremutate(type =case_when( term =="r2"~"Model\nPerformance", term =="Delta"~"Model\nPerformance",str_detect(term,":") ~"Regime\nInteraction",TRUE~"Fixed\nEffects"),type =factor(type, levels =c("Fixed\nEffects", "Model\nPerformance"))) %>%ggplot( aes(term, `Model ID`)) +geom_tile(aes(fill =I(fill_color)), color ="gray20", linewidth =0.2, alpha =0.75) +geom_text(aes(label = estimate)) +facet_grid(survey_area ~ type, scales ="free", space ="free") +theme(axis.text.x =element_text(angle =45, hjust =1, vjust =1)) +labs(x ="Model Predictor",y ="",title ="Southern New England Size Spectra Driver Exploration",subtitle ="Interaction terms should be added to the corresponding fixed effect to determine net-influence." )```#### Model Summary```{r}# output summary - brrom.helpers and gtsummary# Best Modeltbl1 <- bmod %>%tbl_regression() %>%add_glance_table() %>%bold_labels()tbl1```#### VIF```{r}all_parms <-lm(b ~ SST + GSI + landings + zp_small + zp_large, data = sne_df)vif(all_parms) %>%as.data.frame() %>%rownames_to_column(var ="Predictor") %>%rename(VIF =".") %>%arrange(VIF) %>%arrange(VIF) %>%gt() ```#### Actual vs. Fitted```{r}augment(bmod, sne_df ) %>%ggplot(aes(year)) +geom_line(aes(y = b, color ="Actual"), linewidth =1) +geom_line(aes(y = .fitted, color ="Predicted"), linewidth =1) +scale_color_gmri() +labs(color ="Color", y ="Spectrum Slope")```#### Residual Autocorrelation```{r}augment(bmod, sne_df ) %>%ggplot(aes(year, y = .resid)) +geom_line(linewidth =1, color ="gray30") +geom_point(size =2, color ="gray30") +geom_hline(yintercept =0, linetype =2, color ="darkred", linewidth =1) +scale_color_gmri() +labs(color ="Color", y ="Model Residuals")```:::### Mid-Atlantic Bight```{r}# fit model with all parametersall_parms <-lm(b ~ SST + GSI + landings + zp_small + zp_large, data = mab_df)# Refine the dredgeresults <-dredge(all_parms)#grab best model# subset(results, delta == 0)bmod <-get.models(results, subset = delta ==0)[[1]]```::: panel-tabset#### Model Selection Table```{r}#| label: mab-top-models-table#----------------------# Pull details for models with delta AIC < 2top_mods_mab <-get.models(results, subset = delta <2)# Peformance, Gets the AIC order top_perf <-map_dfr(top_mods_mab, glance, .id ="id") %>%arrange(AIC)top_perf$Delta <- top_perf$AIC -min(top_perf$AIC)top_perf <- top_perf %>%arrange(Delta) %>%mutate(`Model ID`=row_number(),`Model ID`=str_c("Model ", `Model ID`))# Top Model Coefficients, join in the model number based on AIC ordertop_coef <-map_dfr(top_mods_mab, tidy, .id ="id") %>%left_join(select(top_perf, id, `Model ID`)) %>%select(-id)# Reshape Performance Specs to bind_row latertop_perf <-select(top_perf, `Model ID`, r2 = r.squared, Delta) %>%pivot_longer(cols =-1, names_to ="term", values_to ="estimate") %>%mutate(fill_color ="lightgray")# Clean up the data and put it into a graphtop_coef %>%# Tidy up the names and set the color scheme for fillmutate(term =ifelse(term =="landings", "Commercial Landings", term),term =ifelse(term =="zp_small", "Small Zooplankton", term),term =ifelse(term =="zp_large", "Large Zooplankton", term),fill_color =ifelse(estimate<0, "#EACA00", "#00736D")) %>%# Join in the Performance Details and set Factor Ordersbind_rows(top_perf) %>%mutate(survey_area ="Mid-Atlantic Bight",`Model ID`=fct_rev(`Model ID`),estimate =round(estimate, 2)) %>%# Set the different types of estimates heremutate(type =case_when( term =="r2"~"Model\nPerformance", term =="Delta"~"Model\nPerformance",str_detect(term,":") ~"Regime\nInteraction",TRUE~"Fixed\nEffects"),type =factor(type, levels =c("Fixed\nEffects", "Model\nPerformance"))) %>%ggplot( aes(term, `Model ID`)) +geom_tile(aes(fill =I(fill_color)), color ="gray20", linewidth =0.2, alpha =0.75) +geom_text(aes(label = estimate)) +facet_grid(survey_area ~ type, scales ="free", space ="free") +theme(axis.text.x =element_text(angle =45, hjust =1, vjust =1)) +labs(x ="Model Predictor",y ="",title ="Mid-Atlantic Bight Driver Exploration",subtitle ="Interaction terms should be added to the corresponding fixed effect to determine net-influence." )```#### Model Summary```{r}# output summary - brrom.helpers and gtsummary# Best Modeltbl1 <- bmod %>%tbl_regression() %>%add_glance_table() %>%bold_labels()tbl1```#### VIF```{r}all_parms <-lm(b ~ SST + GSI + landings + zp_small + zp_large, data = mab_df)vif(all_parms) %>%as.data.frame() %>%rownames_to_column(var ="Predictor") %>%rename(VIF =".") %>%arrange(VIF) %>%gt() ```#### Actual vs. Fitted```{r}augment(bmod, mab_df ) %>%ggplot(aes(year)) +geom_line(aes(y = b, color ="Actual"), linewidth =1) +geom_line(aes(y = .fitted, color ="Predicted"), linewidth =1) +scale_color_gmri() +labs(color ="Color", y ="Spectrum Slope")```#### Residual Autocorrelation```{r}augment(bmod, mab_df ) %>%ggplot(aes(year, y = .resid)) +geom_line(linewidth =1, color ="gray30") +geom_point(size =2, color ="gray30") +geom_hline(yintercept =0, linetype =2, color ="darkred", linewidth =1) +scale_color_gmri() +labs(color ="Color", y ="Model Residuals") ```::::::### Potential Drivers Timeseries:The following panels show the historical changes in each of the drivers. Landings have been scaled by average total landings within each region across all years. SST and GSI have not been scaled. This is different from how they are implemented in the regression analyses, when the landings were scaled over the 1982-2019 period.::: panel-tabset#### Surface Temperature Anomalies```{r}#| label: temperature-index#| fig.height: 8# Plot the timelinestemp_regimes %>%mutate(anom_direction =ifelse(sst_anom >0, "Positive", "Negative")) %>%ggplot( aes(yr, sst_anom)) +geom_col(aes(fill = anom_direction), linewidth =0.75, alpha =0.8) +geom_hline(yintercept =0, linewidth =0.5, lty =1) +scale_x_continuous(expand =expansion(add =c(0.25,0.25))) +scale_y_continuous(labels =number_format(suffix =" \u00b0C")) +scale_fill_gmri() +facet_wrap(~survey_area, ncol =1) +theme(legend.title =element_blank(),legend.background =element_rect(fill ="transparent"),legend.key =element_rect(fill ="transparent", color ="transparent")) +labs(title ="NW-Atlantic Regional SST Anomalies",x ="", y ="Surface Temperature Anomaly",caption ="Anomalies calculated using 1982-2011 reference period.") +guides(fill ="none",label ="none",color =guide_legend(keyheight =unit(0.5, "cm")))```#### GARFO Landings```{r}#| label: landings-index#| fig.height: 8#### Landings ####landings_summ %>%mutate(anom_direction =ifelse(total_wt_z >0, "Positive", "Negative")) %>%ggplot(aes(year, total_wt_z, group = survey_area)) +geom_col(aes(fill = anom_direction), linewidth =0.75, alpha =0.8) +geom_hline(yintercept =0, linewidth =0.5, lty =1) +scale_fill_gmri() +facet_wrap(~survey_area, ncol =1) +scale_x_continuous(breaks =seq(1950, 2030, by =10)) +labs(x ="Year", y ="Finfish Landings Index (z)",title ="GARFO Finfish Landings") +guides(fill ="none")```#### Climate Modes```{r}#| label: climate-modes#### Climate Drivers ####clim_drivers %>%group_by(year = lubridate::year(Time), Var, EPU) %>%summarise(Value =mean(Value, na.rm = T)) %>%mutate(anom_direction =ifelse(Value >0, "Positive", "Negative"),Var =str_to_title(Var)) %>%ggplot(aes(year, Value, group = EPU)) +geom_col(aes(fill = anom_direction), linewidth =0.75, alpha =0.8) +geom_hline(yintercept =0, linewidth =0.5, lty =1) +scale_fill_gmri() +facet_wrap(~Var, ncol =1) +scale_y_continuous(labels = scales::comma_format()) +scale_x_continuous(limits =c(1960, 2020), breaks =seq(1950, 2030, by =10)) +labs(x ="Year", y ="Index",title ="Environmental Drivers") +guides(fill ="none")```#### Zooplankton Indices```{r}zp_index %>%pivot_longer(cols =starts_with("zp"), names_to ="Var", values_to ="Value") %>%group_by(year, Var, survey_area) %>%summarise(Value =mean(Value, na.rm = T)) %>%mutate(anom_direction =ifelse(Value >0, "Positive", "Negative"),Var =ifelse( Var =="zp_large", "Large Zooplankton", "Small Zooplankton"),survey_area =factor(survey_area, levels = area_levels_long)) %>%ggplot(aes(year, Value, group = survey_area)) +geom_col(aes(fill = anom_direction), linewidth =0.75, alpha =0.8) +geom_hline(yintercept =0, linewidth =0.5, lty =1) +scale_fill_gmri() +facet_grid(survey_area~Var) +scale_y_continuous(labels = scales::comma_format()) +scale_x_continuous(limits =c(1960, 2020), breaks =seq(1950, 2030, by =10)) +labs(x ="Year", y ="Index",title ="Environmental Drivers") +guides(fill ="none")```:::# Supplemental Materials::: panel-tabset## Species Functional Groups```{r}#| label: tbl-functional-groups#| tbl-cap: "Common and scientific names for the species that constitute each functional group used in our analyses. X markers are used to indicate which regions each species has been caught in the data."# Tidy up a key of what is labeled whatspec_key <- catch_size_bins %>%distinct(comname, scientific_name, hare_group) %>%mutate(hare_group =ifelse(hare_group %in%c("elasmobranch", "groundfish"), hare_group, str_c(hare_group)),hare_group =ifelse(is.na(hare_group), "Not Classified", hare_group),hare_group =str_to_title(hare_group),hare_group =fct_drop(hare_group),hare_group =factor(hare_group, levels =c("Coastal", "Diadromous", "Elasmobranch", "Groundfish", "Pelagic", "Reef", "Not Classified"))) %>%group_by(hare_group) %>%mutate(n =str_c("(",n(),")")) %>%ungroup()# Reformat the data to check presence absencecatch_size_bins %>%distinct(comname, survey_area) %>%mutate(Presence ="X") %>%complete(comname, survey_area) %>%left_join(spec_key, by ="comname") %>%mutate(comname =str_to_title(comname),Presence =ifelse(is.na(Presence) ==FALSE, "X", ""),Presence =ifelse(is.na(Presence) ==TRUE, "", Presence)) %>%left_join(area_df, by ="survey_area") %>%select(hare_group, n, comname, scientific_name, area, Presence) %>%pivot_wider(names_from = area, values_from = Presence) %>%arrange(hare_group, comname) %>%group_by(hare_group,n) %>%gt() %>% gt::tab_header(title =md("**Functional Group Assignments and Regional Presence/Absence**")) %>%tab_stubhead(label =md("**Functional Group**")) %>%cols_label(comname =md("*Common Name*"),scientific_name =md("*Scientific Name*"),`Gulf of Maine`=md("*Gulf of Maine*"),`Georges Bank`=md("*Georges Bank*"),`Southern New England`=md("*Southern New England*"),`Mid-Atlantic Bight`=md("*Mid-Atlantic Bight*") ) %>%tab_style(style =list(cell_text(weight ="bold")),locations =cells_row_groups()) %>% gt::tab_source_note(source_note =md("*Functional group assignments adapted from Hare et al. 2010*"))```## GARFO Landings Summary```{r}#| label: garfo-landings-summary# Add the labels into the landings data and remove what we don't need there:landings %>%filter(between(year, 1960, 2019),!str_detect(sppname, "CONFIDENTIAL")) %>%mutate(decade =floor_decade(year),sppname =str_to_title(sppname)) %>%group_by(survey_area, decade, sppname) %>%summarise(avg_landings_lb =mean(`landed lbs`, na.rm = T),across(.cols =c(landings_lb =`landed lbs`, value = value), .fns = sum, .names ="total_{.col}"),.groups ="drop") %>%group_by(survey_area, decade) %>%slice_max(order_by = total_landings_lb, n =3) %>%gt() %>% gt::tab_header(title =md("**Top Commercial Fisheries Landings of Northeastern US (by weight)**")) %>%tab_stubhead(label =md("*Harvest Region*")) %>%fmt_number(columns =c(avg_landings_lb, total_landings_lb, total_value),use_seps = T, sep_mark =",",suffixing = T) %>%tab_style(style =list(cell_text(weight ="bold")),locations =cells_row_groups()) %>%cols_label(decade =md("*Decade*"),sppname =md(""),avg_landings_lb =md("*Avg. Annual Landings (lb.)*"),total_landings_lb =md("*Total Landings (lb.)*"),total_value =md("*Total Value ($)*")) %>% gt::tab_source_note(source_note =md("*Landings data obtained from the Greater Atlantic Regional Fishing Office (GARFO)*"))```:::```{r}#| label: weisberg-synthesis#| eval: false# Stuff for Sara:# 1. Plot of Gulf of Maine Slopes, no gray linessara_plot <- region_indices %>%filter(survey_area =="GoM") %>%ggplot(aes(yr, b, group = area_titles)) +# geom_line(aes(group = area_titles), linewidth = 1) +geom_point(size =1) +geom_smooth(method ='loess', formula ='y ~ x', color ="orange") +#geom_pointrange(aes(group = area_titles, ymin = confMin, ymax = confMax), size = 0.25) +scale_color_gmri() +facet_wrap(~area_titles, ncol =1) +labs(x ="Year", y ="Abundance Size Spectrum Slope (b)", title ="Exponent of Abundance Size Spectra")# The decline in weight of groundfish to go with:# Average individual weightfunctional_group_size %>%filter(survey_area =="GoM", hare_group %in%c("groundfish", "elasmobranch")) %>%ggplot(aes(Year, mean_wt_kg)) +geom_point(size =1) +geom_smooth(color ="orange", method ="loess", linewidth =1) +facet_grid(hare_group~area_titles, scales ="free_y") +theme(axis.text =element_text(size =8)) +labs(title ="Average Weight",y ="Weight (kg)",color ="Region")# How much biomass is just gf and elsmo:# Plot abundance by region and Groupfgroup_area$weight_bins %>%filter(survey_area =="GoM") %>%ggplot(aes(Year, strat_lwbio_mill, fill = hare_group)) +geom_col(color ="gray30", linewidth =0.1, position ="fill") +facet_grid(.~survey_area) +scale_fill_gmri() +scale_y_continuous(labels =percent_format()) +labs(y ="Fraction of Biomass", x =NULL, fill ="Functional Group") +theme(legend.position ="bottom",axis.text.x =element_text(angle =45, hjust =1))# Table of averages over different periods for all of them:region_indices %>%filter(yr %in%c(1970:2000)) %>%mutate(year_range ="1970-2000") %>%group_by(year_range, area_titles) %>%summarise(avg_slope =mean(b, na.rm = T),sd_slope =sd(b, na.rm = T) )```# References